1095 lines
33 KiB
Plaintext
1095 lines
33 KiB
Plaintext
=head1 NAME
|
|
|
|
Pod::Simple::Subclassing -- write a formatter as a Pod::Simple subclass
|
|
|
|
=head1 SYNOPSIS
|
|
|
|
package Pod::SomeFormatter;
|
|
use Pod::Simple;
|
|
@ISA = qw(Pod::Simple);
|
|
$VERSION = '1.01';
|
|
use strict;
|
|
|
|
sub _handle_element_start {
|
|
my($parser, $element_name, $attr_hash_r) = @_;
|
|
...
|
|
}
|
|
|
|
sub _handle_element_end {
|
|
my($parser, $element_name, $attr_hash_r) = @_;
|
|
# NOTE: $attr_hash_r is only present when $element_name is "over" or "begin"
|
|
# The remaining code excerpts will mostly ignore this $attr_hash_r, as it is
|
|
# mostly useless. It is documented where "over-*" and "begin" events are
|
|
# documented.
|
|
...
|
|
}
|
|
|
|
sub _handle_text {
|
|
my($parser, $text) = @_;
|
|
...
|
|
}
|
|
1;
|
|
|
|
=head1 DESCRIPTION
|
|
|
|
This document is about using Pod::Simple to write a Pod processor,
|
|
generally a Pod formatter. If you just want to know about using an
|
|
existing Pod formatter, instead see its documentation and see also the
|
|
docs in L<Pod::Simple>.
|
|
|
|
B<The zeroeth step> in writing a Pod formatter is to make sure that there
|
|
isn't already a decent one in CPAN. See L<http://search.cpan.org/>, and
|
|
run a search on the name of the format you want to render to. Also
|
|
consider joining the Pod People list
|
|
L<http://lists.perl.org/showlist.cgi?name=pod-people> and asking whether
|
|
anyone has a formatter for that format -- maybe someone cobbled one
|
|
together but just hasn't released it.
|
|
|
|
B<The first step> in writing a Pod processor is to read L<perlpodspec>,
|
|
which contains information on writing a Pod parser (which has been
|
|
largely taken care of by Pod::Simple), but also a lot of requirements
|
|
and recommendations for writing a formatter.
|
|
|
|
B<The second step> is to actually learn the format you're planning to
|
|
format to -- or at least as much as you need to know to represent Pod,
|
|
which probably isn't much.
|
|
|
|
B<The third step> is to pick which of Pod::Simple's interfaces you want to
|
|
use:
|
|
|
|
=over
|
|
|
|
=item Pod::Simple
|
|
|
|
The basic L<Pod::Simple> interface that uses C<_handle_element_start()>,
|
|
C<_handle_element_end()> and C<_handle_text()>.
|
|
|
|
=item Pod::Simple::Methody
|
|
|
|
The L<Pod::Simple::Methody> interface is event-based, similar to that of
|
|
L<HTML::Parser> or L<XML::Parser>'s "Handlers".
|
|
|
|
=item Pod::Simple::PullParser
|
|
|
|
L<Pod::Simple::PullParser> provides a token-stream interface, sort of
|
|
like L<HTML::TokeParser>'s interface.
|
|
|
|
=item Pod::Simple::SimpleTree
|
|
|
|
L<Pod::Simple::SimpleTree> provides a simple tree interface, rather like
|
|
L<XML::Parser>'s "Tree" interface. Users familiar with XML handling will
|
|
be comfortable with this interface. Users interested in outputting XML,
|
|
should look into the modules that produce an XML representation of the
|
|
Pod stream, notably L<Pod::Simple::XMLOutStream>; you can feed the output
|
|
of such a class to whatever XML parsing system you are most at home with.
|
|
|
|
=back
|
|
|
|
B<The last step> is to write your code based on how the events (or tokens,
|
|
or tree-nodes, or the XML, or however you're parsing) will map to
|
|
constructs in the output format. Also be sure to consider how to escape
|
|
text nodes containing arbitrary text, and what to do with text
|
|
nodes that represent preformatted text (from verbatim sections).
|
|
|
|
|
|
|
|
=head1 Events
|
|
|
|
TODO intro... mention that events are supplied for implicits, like for
|
|
missing >'s
|
|
|
|
In the following section, we use XML to represent the event structure
|
|
associated with a particular construct. That is, an opening tag
|
|
represents the element start, the attributes of that opening tag are
|
|
the attributes given to the callback, and the closing tag represents
|
|
the end element.
|
|
|
|
Three callback methods must be supplied by a class extending
|
|
L<Pod::Simple> to receive the corresponding event:
|
|
|
|
=over
|
|
|
|
=item C<< $parser->_handle_element_start( I<element_name>, I<attr_hashref> ) >>
|
|
|
|
=item C<< $parser->_handle_element_end( I<element_name> ) >>
|
|
|
|
=item C<< $parser->_handle_text( I<text_string> ) >>
|
|
|
|
=back
|
|
|
|
Here's the comprehensive list of values you can expect as
|
|
I<element_name> in your implementation of C<_handle_element_start>
|
|
and C<_handle_element_end>::
|
|
|
|
=over
|
|
|
|
=item events with an element_name of Document
|
|
|
|
Parsing a document produces this event structure:
|
|
|
|
<Document start_line="543">
|
|
...all events...
|
|
</Document>
|
|
|
|
The value of the I<start_line> attribute will be the line number of the first
|
|
Pod directive in the document.
|
|
|
|
If there is no Pod in the given document, then the
|
|
event structure will be this:
|
|
|
|
<Document contentless="1" start_line="543">
|
|
</Document>
|
|
|
|
In that case, the value of the I<start_line> attribute will not be meaningful;
|
|
under current implementations, it will probably be the line number of the
|
|
last line in the file.
|
|
|
|
=item events with an element_name of Para
|
|
|
|
Parsing a plain (non-verbatim, non-directive, non-data) paragraph in
|
|
a Pod document produces this event structure:
|
|
|
|
<Para start_line="543">
|
|
...all events in this paragraph...
|
|
</Para>
|
|
|
|
The value of the I<start_line> attribute will be the line number of the start
|
|
of the paragraph.
|
|
|
|
For example, parsing this paragraph of Pod:
|
|
|
|
The value of the I<start_line> attribute will be the
|
|
line number of the start of the paragraph.
|
|
|
|
produces this event structure:
|
|
|
|
<Para start_line="129">
|
|
The value of the
|
|
<I>
|
|
start_line
|
|
</I>
|
|
attribute will be the line number of the first Pod directive
|
|
in the document.
|
|
</Para>
|
|
|
|
=item events with an element_name of B, C, F, or I.
|
|
|
|
Parsing a BE<lt>...E<gt> formatting code (or of course any of its
|
|
semantically identical syntactic variants
|
|
S<BE<lt>E<lt> ... E<gt>E<gt>>,
|
|
or S<BE<lt>E<lt>E<lt>E<lt> ... E<gt>E<gt>E<gt>E<gt>>, etc.)
|
|
produces this event structure:
|
|
|
|
<B>
|
|
...stuff...
|
|
</B>
|
|
|
|
Currently, there are no attributes conveyed.
|
|
|
|
Parsing C, F, or I codes produce the same structure, with only a
|
|
different element name.
|
|
|
|
If your parser object has been set to accept other formatting codes,
|
|
then they will be presented like these B/C/F/I codes -- i.e., without
|
|
any attributes.
|
|
|
|
=item events with an element_name of S
|
|
|
|
Normally, parsing an SE<lt>...E<gt> sequence produces this event
|
|
structure, just as if it were a B/C/F/I code:
|
|
|
|
<S>
|
|
...stuff...
|
|
</S>
|
|
|
|
However, Pod::Simple (and presumably all derived parsers) offers the
|
|
C<nbsp_for_S> option which, if enabled, will suppress all S events, and
|
|
instead change all spaces in the content to non-breaking spaces. This is
|
|
intended for formatters that output to a format that has no code that
|
|
means the same as SE<lt>...E<gt>, but which has a code/character that
|
|
means non-breaking space.
|
|
|
|
=item events with an element_name of X
|
|
|
|
Normally, parsing an XE<lt>...E<gt> sequence produces this event
|
|
structure, just as if it were a B/C/F/I code:
|
|
|
|
<X>
|
|
...stuff...
|
|
</X>
|
|
|
|
However, Pod::Simple (and presumably all derived parsers) offers the
|
|
C<nix_X_codes> option which, if enabled, will suppress all X events
|
|
and ignore their content. For formatters/processors that don't use
|
|
X events, this is presumably quite useful.
|
|
|
|
|
|
=item events with an element_name of L
|
|
|
|
Because the LE<lt>...E<gt> is the most complex construct in the
|
|
language, it should not surprise you that the events it generates are
|
|
the most complex in the language. Most of complexity is hidden away in
|
|
the attribute values, so for those of you writing a Pod formatter that
|
|
produces a non-hypertextual format, you can just ignore the attributes
|
|
and treat an L event structure like a formatting element that
|
|
(presumably) doesn't actually produce a change in formatting. That is,
|
|
the content of the L event structure (as opposed to its
|
|
attributes) is always what text should be displayed.
|
|
|
|
There are, at first glance, three kinds of L links: URL, man, and pod.
|
|
|
|
When a LE<lt>I<some_url>E<gt> code is parsed, it produces this event
|
|
structure:
|
|
|
|
<L content-implicit="yes" raw="that_url" to="that_url" type="url">
|
|
that_url
|
|
</L>
|
|
|
|
The C<type="url"> attribute is always specified for this type of
|
|
L code.
|
|
|
|
For example, this Pod source:
|
|
|
|
L<http://www.perl.com/CPAN/authors/>
|
|
|
|
produces this event structure:
|
|
|
|
<L content-implicit="yes" raw="http://www.perl.com/CPAN/authors/" to="http://www.perl.com/CPAN/authors/" type="url">
|
|
http://www.perl.com/CPAN/authors/
|
|
</L>
|
|
|
|
When a LE<lt>I<manpage(section)>E<gt> code is parsed (and these are
|
|
fairly rare and not terribly useful), it produces this event structure:
|
|
|
|
<L content-implicit="yes" raw="manpage(section)" to="manpage(section)" type="man">
|
|
manpage(section)
|
|
</L>
|
|
|
|
The C<type="man"> attribute is always specified for this type of
|
|
L code.
|
|
|
|
For example, this Pod source:
|
|
|
|
L<crontab(5)>
|
|
|
|
produces this event structure:
|
|
|
|
<L content-implicit="yes" raw="crontab(5)" to="crontab(5)" type="man">
|
|
crontab(5)
|
|
</L>
|
|
|
|
In the rare cases where a man page link has a section specified, that text appears
|
|
in a I<section> attribute. For example, this Pod source:
|
|
|
|
L<crontab(5)/"ENVIRONMENT">
|
|
|
|
will produce this event structure:
|
|
|
|
<L content-implicit="yes" raw="crontab(5)/"ENVIRONMENT"" section="ENVIRONMENT" to="crontab(5)" type="man">
|
|
"ENVIRONMENT" in crontab(5)
|
|
</L>
|
|
|
|
In the rare case where the Pod document has code like
|
|
LE<lt>I<sometext>|I<manpage(section)>E<gt>, then the I<sometext> will appear
|
|
as the content of the element, the I<manpage(section)> text will appear
|
|
only as the value of the I<to> attribute, and there will be no
|
|
C<content-implicit="yes"> attribute (whose presence means that the Pod parser
|
|
had to infer what text should appear as the link text -- as opposed to
|
|
cases where that attribute is absent, which means that the Pod parser did
|
|
I<not> have to infer the link text, because that L code explicitly specified
|
|
some link text.)
|
|
|
|
For example, this Pod source:
|
|
|
|
L<hell itself!|crontab(5)>
|
|
|
|
will produce this event structure:
|
|
|
|
<L raw="hell itself!|crontab(5)" to="crontab(5)" type="man">
|
|
hell itself!
|
|
</L>
|
|
|
|
The last type of L structure is for links to/within Pod documents. It is
|
|
the most complex because it can have a I<to> attribute, I<or> a
|
|
I<section> attribute, or both. The C<type="pod"> attribute is always
|
|
specified for this type of L code.
|
|
|
|
In the most common case, the simple case of a LE<lt>podpageE<gt> code
|
|
produces this event structure:
|
|
|
|
<L content-implicit="yes" raw="podpage" to="podpage" type="pod">
|
|
podpage
|
|
</L>
|
|
|
|
For example, this Pod source:
|
|
|
|
L<Net::Ping>
|
|
|
|
produces this event structure:
|
|
|
|
<L content-implicit="yes" raw="Net::Ping" to="Net::Ping" type="pod">
|
|
Net::Ping
|
|
</L>
|
|
|
|
In cases where there is link-text explicitly specified, it
|
|
is to be found in the content of the element (and not the
|
|
attributes), just as with the LE<lt>I<sometext>|I<manpage(section)>E<gt>
|
|
case discussed above. For example, this Pod source:
|
|
|
|
L<Perl Error Messages|perldiag>
|
|
|
|
produces this event structure:
|
|
|
|
<L raw="Perl Error Messages|perldiag" to="perldiag" type="pod">
|
|
Perl Error Messages
|
|
</L>
|
|
|
|
In cases of links to a section in the current Pod document,
|
|
there is a I<section> attribute instead of a I<to> attribute.
|
|
For example, this Pod source:
|
|
|
|
L</"Member Data">
|
|
|
|
produces this event structure:
|
|
|
|
<L content-implicit="yes" raw="/"Member Data"" section="Member Data" type="pod">
|
|
"Member Data"
|
|
</L>
|
|
|
|
As another example, this Pod source:
|
|
|
|
L<the various attributes|/"Member Data">
|
|
|
|
produces this event structure:
|
|
|
|
<L raw="the various attributes|/"Member Data"" section="Member Data" type="pod">
|
|
the various attributes
|
|
</L>
|
|
|
|
In cases of links to a section in a different Pod document,
|
|
there are both a I<section> attribute and a L<to> attribute.
|
|
For example, this Pod source:
|
|
|
|
L<perlsyn/"Basic BLOCKs and Switch Statements">
|
|
|
|
produces this event structure:
|
|
|
|
<L content-implicit="yes" raw="perlsyn/"Basic BLOCKs and Switch Statements"" section="Basic BLOCKs and Switch Statements" to="perlsyn" type="pod">
|
|
"Basic BLOCKs and Switch Statements" in perlsyn
|
|
</L>
|
|
|
|
As another example, this Pod source:
|
|
|
|
L<SWITCH statements|perlsyn/"Basic BLOCKs and Switch Statements">
|
|
|
|
produces this event structure:
|
|
|
|
<L raw="SWITCH statements|perlsyn/"Basic BLOCKs and Switch Statements"" section="Basic BLOCKs and Switch Statements" to="perlsyn" type="pod">
|
|
SWITCH statements
|
|
</L>
|
|
|
|
Incidentally, note that we do not distinguish between these syntaxes:
|
|
|
|
L</"Member Data">
|
|
L<"Member Data">
|
|
L</Member Data>
|
|
L<Member Data> [deprecated syntax]
|
|
|
|
That is, they all produce the same event structure (for the most part), namely:
|
|
|
|
<L content-implicit="yes" raw="$depends_on_syntax" section="Member Data" type="pod">
|
|
"Member Data"
|
|
</L>
|
|
|
|
The I<raw> attribute depends on what the raw content of the C<LE<lt>E<gt>> is,
|
|
so that is why the event structure is the same "for the most part".
|
|
|
|
If you have not guessed it yet, the I<raw> attribute contains the raw,
|
|
original, unescaped content of the C<LE<lt>E<gt>> formatting code. In addition
|
|
to the examples above, take notice of the following event structure produced
|
|
by the following C<LE<lt>E<gt>> formatting code.
|
|
|
|
L<click B<here>|page/About the C<-M> switch>
|
|
|
|
<L raw="click B<here>|page/About the C<-M> switch" section="About the -M switch" to="page" type="pod">
|
|
click B<here>
|
|
</L>
|
|
|
|
Specifically, notice that the formatting codes are present and unescaped
|
|
in I<raw>.
|
|
|
|
There is a known bug in the I<raw> attribute where any surrounding whitespace
|
|
is condensed into a single ' '. For example, given LE<60> linkE<62>, I<raw>
|
|
will be " link".
|
|
|
|
=item events with an element_name of E or Z
|
|
|
|
While there are Pod codes EE<lt>...E<gt> and ZE<lt>E<gt>, these
|
|
I<do not> produce any E or Z events -- that is, there are no such
|
|
events as E or Z.
|
|
|
|
=item events with an element_name of Verbatim
|
|
|
|
When a Pod verbatim paragraph (AKA "codeblock") is parsed, it
|
|
produces this event structure:
|
|
|
|
<Verbatim start_line="543" xml:space="preserve">
|
|
...text...
|
|
</Verbatim>
|
|
|
|
The value of the I<start_line> attribute will be the line number of the
|
|
first line of this verbatim block. The I<xml:space> attribute is always
|
|
present, and always has the value "preserve".
|
|
|
|
The text content will have tabs already expanded.
|
|
|
|
|
|
=item events with an element_name of head1 .. head4
|
|
|
|
When a "=head1 ..." directive is parsed, it produces this event
|
|
structure:
|
|
|
|
<head1>
|
|
...stuff...
|
|
</head1>
|
|
|
|
For example, a directive consisting of this:
|
|
|
|
=head1 Options to C<new> et al.
|
|
|
|
will produce this event structure:
|
|
|
|
<head1 start_line="543">
|
|
Options to
|
|
<C>
|
|
new
|
|
</C>
|
|
et al.
|
|
</head1>
|
|
|
|
"=head2" through "=head4" directives are the same, except for the element
|
|
names in the event structure.
|
|
|
|
=item events with an element_name of encoding
|
|
|
|
In the default case, the events corresponding to C<=encoding> directives
|
|
are not emitted. They are emitted if C<keep_encoding_directive> is true.
|
|
In that case they produce event structures like
|
|
L</"events with an element_name of head1 .. head4"> above.
|
|
|
|
=item events with an element_name of over-bullet
|
|
|
|
When an "=over ... Z<>=back" block is parsed where the items are
|
|
a bulleted list, it will produce this event structure:
|
|
|
|
<over-bullet indent="4" start_line="543">
|
|
<item-bullet start_line="545">
|
|
...Stuff...
|
|
</item-bullet>
|
|
...more item-bullets...
|
|
</over-bullet fake-closer="1">
|
|
|
|
The attribute I<fake-closer> is only present if it is a true value; it is not
|
|
present if it is a false value. It is shown in the above example to illustrate
|
|
where the attribute is (in the B<closing> tag). It signifies that the C<=over>
|
|
did not have a matching C<=back>, and thus Pod::Simple had to create a fake
|
|
closer.
|
|
|
|
For example, this Pod source:
|
|
|
|
=over
|
|
|
|
=item *
|
|
|
|
Something
|
|
|
|
=back
|
|
|
|
Would produce an event structure that does B<not> have the I<fake-closer>
|
|
attribute, whereas this Pod source:
|
|
|
|
=over
|
|
|
|
=item *
|
|
|
|
Gasp! An unclosed =over block!
|
|
|
|
would. The rest of the over-* examples will not demonstrate this attribute,
|
|
but they all can have it. See L<Pod::Checker>'s source for an example of this
|
|
attribute being used.
|
|
|
|
The value of the I<indent> attribute is whatever value is after the
|
|
"=over" directive, as in "=over 8". If no such value is specified
|
|
in the directive, then the I<indent> attribute has the value "4".
|
|
|
|
For example, this Pod source:
|
|
|
|
=over
|
|
|
|
=item *
|
|
|
|
Stuff
|
|
|
|
=item *
|
|
|
|
Bar I<baz>!
|
|
|
|
=back
|
|
|
|
produces this event structure:
|
|
|
|
<over-bullet indent="4" start_line="10">
|
|
<item-bullet start_line="12">
|
|
Stuff
|
|
</item-bullet>
|
|
<item-bullet start_line="14">
|
|
Bar <I>baz</I>!
|
|
</item-bullet>
|
|
</over-bullet>
|
|
|
|
=item events with an element_name of over-number
|
|
|
|
When an "=over ... Z<>=back" block is parsed where the items are
|
|
a numbered list, it will produce this event structure:
|
|
|
|
<over-number indent="4" start_line="543">
|
|
<item-number number="1" start_line="545">
|
|
...Stuff...
|
|
</item-number>
|
|
...more item-number...
|
|
</over-bullet>
|
|
|
|
This is like the "over-bullet" event structure; but note that the contents
|
|
are "item-number" instead of "item-bullet", and note that they will have
|
|
a "number" attribute, which some formatters/processors may ignore
|
|
(since, for example, there's no need for it in HTML when producing
|
|
an "<UL><LI>...</LI>...</UL>" structure), but which any processor may use.
|
|
|
|
Note that the values for the I<number> attributes of "item-number"
|
|
elements in a given "over-number" area I<will> start at 1 and go up by
|
|
one each time. If the Pod source doesn't follow that order (even though
|
|
it really should!), whatever numbers it has will be ignored (with
|
|
the correct values being put in the I<number> attributes), and an error
|
|
message might be issued to the user.
|
|
|
|
=item events with an element_name of over-text
|
|
|
|
These events are somewhat unlike the other over-*
|
|
structures, as far as what their contents are. When
|
|
an "=over ... Z<>=back" block is parsed where the items are
|
|
a list of text "subheadings", it will produce this event structure:
|
|
|
|
<over-text indent="4" start_line="543">
|
|
<item-text>
|
|
...stuff...
|
|
</item-text>
|
|
...stuff (generally Para or Verbatim elements)...
|
|
<item-text>
|
|
...more item-text and/or stuff...
|
|
</over-text>
|
|
|
|
The I<indent> and I<fake-closer> attributes are as with the other over-* events.
|
|
|
|
For example, this Pod source:
|
|
|
|
=over
|
|
|
|
=item Foo
|
|
|
|
Stuff
|
|
|
|
=item Bar I<baz>!
|
|
|
|
Quux
|
|
|
|
=back
|
|
|
|
produces this event structure:
|
|
|
|
<over-text indent="4" start_line="20">
|
|
<item-text start_line="22">
|
|
Foo
|
|
</item-text>
|
|
<Para start_line="24">
|
|
Stuff
|
|
</Para>
|
|
<item-text start_line="26">
|
|
Bar
|
|
<I>
|
|
baz
|
|
</I>
|
|
!
|
|
</item-text>
|
|
<Para start_line="28">
|
|
Quux
|
|
</Para>
|
|
</over-text>
|
|
|
|
|
|
|
|
=item events with an element_name of over-block
|
|
|
|
These events are somewhat unlike the other over-*
|
|
structures, as far as what their contents are. When
|
|
an "=over ... Z<>=back" block is parsed where there are no items,
|
|
it will produce this event structure:
|
|
|
|
<over-block indent="4" start_line="543">
|
|
...stuff (generally Para or Verbatim elements)...
|
|
</over-block>
|
|
|
|
The I<indent> and I<fake-closer> attributes are as with the other over-* events.
|
|
|
|
For example, this Pod source:
|
|
|
|
=over
|
|
|
|
For cutting off our trade with all parts of the world
|
|
|
|
For transporting us beyond seas to be tried for pretended offenses
|
|
|
|
He is at this time transporting large armies of foreign mercenaries to
|
|
complete the works of death, desolation and tyranny, already begun with
|
|
circumstances of cruelty and perfidy scarcely paralleled in the most
|
|
barbarous ages, and totally unworthy the head of a civilized nation.
|
|
|
|
=back
|
|
|
|
will produce this event structure:
|
|
|
|
<over-block indent="4" start_line="2">
|
|
<Para start_line="4">
|
|
For cutting off our trade with all parts of the world
|
|
</Para>
|
|
<Para start_line="6">
|
|
For transporting us beyond seas to be tried for pretended offenses
|
|
</Para>
|
|
<Para start_line="8">
|
|
He is at this time transporting large armies of [...more text...]
|
|
</Para>
|
|
</over-block>
|
|
|
|
=item events with an element_name of over-empty
|
|
|
|
B<Note: These events are only triggered if C<parse_empty_lists()> is set to a
|
|
true value.>
|
|
|
|
These events are somewhat unlike the other over-* structures, as far as what
|
|
their contents are. When an "=over ... Z<>=back" block is parsed where there
|
|
is no content, it will produce this event structure:
|
|
|
|
<over-empty indent="4" start_line="543">
|
|
</over-empty>
|
|
|
|
The I<indent> and I<fake-closer> attributes are as with the other over-* events.
|
|
|
|
For example, this Pod source:
|
|
|
|
=over
|
|
|
|
=over
|
|
|
|
=back
|
|
|
|
=back
|
|
|
|
will produce this event structure:
|
|
|
|
<over-block indent="4" start_line="1">
|
|
<over-empty indent="4" start_line="3">
|
|
</over-empty>
|
|
</over-block>
|
|
|
|
Note that the outer C<=over> is a block because it has no C<=item>s but still
|
|
has content: the inner C<=over>. The inner C<=over>, in turn, is completely
|
|
empty, and is treated as such.
|
|
|
|
=item events with an element_name of item-bullet
|
|
|
|
See L</"events with an element_name of over-bullet">, above.
|
|
|
|
=item events with an element_name of item-number
|
|
|
|
See L</"events with an element_name of over-number">, above.
|
|
|
|
=item events with an element_name of item-text
|
|
|
|
See L</"events with an element_name of over-text">, above.
|
|
|
|
=item events with an element_name of for
|
|
|
|
TODO...
|
|
|
|
=item events with an element_name of Data
|
|
|
|
TODO...
|
|
|
|
=back
|
|
|
|
|
|
|
|
=head1 More Pod::Simple Methods
|
|
|
|
Pod::Simple provides a lot of methods that aren't generally interesting
|
|
to the end user of an existing Pod formatter, but some of which you
|
|
might find useful in writing a Pod formatter. They are listed below. The
|
|
first several methods (the accept_* methods) are for declaring the
|
|
capabilities of your parser, notably what C<=for I<targetname>> sections
|
|
it's interested in, what extra NE<lt>...E<gt> codes it accepts beyond
|
|
the ones described in the I<perlpod>.
|
|
|
|
=over
|
|
|
|
=item C<< $parser->accept_targets( I<SOMEVALUE> ) >>
|
|
|
|
As the parser sees sections like:
|
|
|
|
=for html <img src="fig1.jpg">
|
|
|
|
or
|
|
|
|
=begin html
|
|
|
|
<img src="fig1.jpg">
|
|
|
|
=end html
|
|
|
|
...the parser will ignore these sections unless your subclass has
|
|
specified that it wants to see sections targeted to "html" (or whatever
|
|
the formatter name is).
|
|
|
|
If you want to process all sections, even if they're not targeted for you,
|
|
call this before you start parsing:
|
|
|
|
$parser->accept_targets('*');
|
|
|
|
=item C<< $parser->accept_targets_as_text( I<SOMEVALUE> ) >>
|
|
|
|
This is like accept_targets, except that it specifies also that the
|
|
content of sections for this target should be treated as Pod text even
|
|
if the target name in "=for I<targetname>" doesn't start with a ":".
|
|
|
|
At time of writing, I don't think you'll need to use this.
|
|
|
|
|
|
=item C<< $parser->accept_codes( I<Codename>, I<Codename>... ) >>
|
|
|
|
This tells the parser that you accept additional formatting codes,
|
|
beyond just the standard ones (I B C L F S X, plus the two weird ones
|
|
you don't actually see in the parse tree, Z and E). For example, to also
|
|
accept codes "N", "R", and "W":
|
|
|
|
$parser->accept_codes( qw( N R W ) );
|
|
|
|
B<TODO: document how this interacts with =extend, and long element names>
|
|
|
|
|
|
=item C<< $parser->accept_directive_as_data( I<directive_name> ) >>
|
|
|
|
=item C<< $parser->accept_directive_as_verbatim( I<directive_name> ) >>
|
|
|
|
=item C<< $parser->accept_directive_as_processed( I<directive_name> ) >>
|
|
|
|
In the unlikely situation that you need to tell the parser that you will
|
|
accept additional directives ("=foo" things), you need to first set the
|
|
parser to treat its content as data (i.e., not really processed at
|
|
all), or as verbatim (mostly just expanding tabs), or as processed text
|
|
(parsing formatting codes like BE<lt>...E<gt>).
|
|
|
|
For example, to accept a new directive "=method", you'd presumably
|
|
use:
|
|
|
|
$parser->accept_directive_as_processed("method");
|
|
|
|
so that you could have Pod lines like:
|
|
|
|
=method I<$whatever> thing B<um>
|
|
|
|
Making up your own directives breaks compatibility with other Pod
|
|
formatters, in a way that using "=for I<target> ..." lines doesn't;
|
|
however, you may find this useful if you're making a Pod superset
|
|
format where you don't need to worry about compatibility.
|
|
|
|
|
|
=item C<< $parser->nbsp_for_S( I<BOOLEAN> ); >>
|
|
|
|
Setting this attribute to a true value (and by default it is false) will
|
|
turn "SE<lt>...E<gt>" sequences into sequences of words separated by
|
|
C<\xA0> (non-breaking space) characters. For example, it will take this:
|
|
|
|
I like S<Dutch apple pie>, don't you?
|
|
|
|
and treat it as if it were:
|
|
|
|
I like DutchE<nbsp>appleE<nbsp>pie, don't you?
|
|
|
|
This is handy for output formats that don't have anything quite like an
|
|
"SE<lt>...E<gt>" code, but which do have a code for non-breaking space.
|
|
|
|
There is currently no method for going the other way; but I can
|
|
probably provide one upon request.
|
|
|
|
|
|
=item C<< $parser->version_report() >>
|
|
|
|
This returns a string reporting the $VERSION value from your module (and
|
|
its classname) as well as the $VERSION value of Pod::Simple. Note that
|
|
L<perlpodspec> requires output formats (wherever possible) to note
|
|
this detail in a comment in the output format. For example, for
|
|
some kind of SGML output format:
|
|
|
|
print OUT "<!-- \n", $parser->version_report, "\n -->";
|
|
|
|
|
|
=item C<< $parser->pod_para_count() >>
|
|
|
|
This returns the count of Pod paragraphs seen so far.
|
|
|
|
|
|
=item C<< $parser->line_count() >>
|
|
|
|
This is the current line number being parsed. But you might find the
|
|
"line_number" event attribute more accurate, when it is present.
|
|
|
|
|
|
=item C<< $parser->nix_X_codes( I<SOMEVALUE> ) >>
|
|
|
|
This attribute, when set to a true value (and it is false by default)
|
|
ignores any "XE<lt>...E<gt>" sequences in the document being parsed.
|
|
Many formats don't actually use the content of these codes, so have
|
|
no reason to process them.
|
|
|
|
=item C<< $parser->keep_encoding_directive( I<SOMEVALUE> ) >>
|
|
|
|
This attribute, when set to a true value (it is false by default)
|
|
will keep C<=encoding> and its content in the event structure. Most
|
|
formats don't actually need to process the content of an C<=encoding>
|
|
directive, even when this directive sets the encoding and the
|
|
processor makes use of the encoding information. Indeed, it is
|
|
possible to know the encoding without processing the directive
|
|
content.
|
|
|
|
=item C<< $parser->merge_text( I<SOMEVALUE> ) >>
|
|
|
|
This attribute, when set to a true value (and it is false by default)
|
|
makes sure that only one event (or token, or node) will be created
|
|
for any single contiguous sequence of text. For example, consider
|
|
this somewhat contrived example:
|
|
|
|
I just LOVE Z<>hotE<32>apple pie!
|
|
|
|
When that is parsed and events are about to be called on it, it may
|
|
actually seem to be four different text events, one right after another:
|
|
one event for "I just LOVE ", one for "hot", one for " ", and one for
|
|
"apple pie!". But if you have merge_text on, then you're guaranteed
|
|
that it will be fired as one text event: "I just LOVE hot apple pie!".
|
|
|
|
|
|
=item C<< $parser->code_handler( I<CODE_REF> ) >>
|
|
|
|
This specifies code that should be called when a code line is seen
|
|
(i.e., a line outside of the Pod). Normally this is undef, meaning
|
|
that no code should be called. If you provide a routine, it should
|
|
start out like this:
|
|
|
|
sub get_code_line { # or whatever you'll call it
|
|
my($line, $line_number, $parser) = @_;
|
|
...
|
|
}
|
|
|
|
Note, however, that sometimes the Pod events aren't processed in exactly
|
|
the same order as the code lines are -- i.e., if you have a file with
|
|
Pod, then code, then more Pod, sometimes the code will be processed (via
|
|
whatever you have code_handler call) before the all of the preceding Pod
|
|
has been processed.
|
|
|
|
|
|
=item C<< $parser->cut_handler( I<CODE_REF> ) >>
|
|
|
|
This is just like the code_handler attribute, except that it's for
|
|
"=cut" lines, not code lines. The same caveats apply. "=cut" lines are
|
|
unlikely to be interesting, but this is included for completeness.
|
|
|
|
|
|
=item C<< $parser->pod_handler( I<CODE_REF> ) >>
|
|
|
|
This is just like the code_handler attribute, except that it's for
|
|
"=pod" lines, not code lines. The same caveats apply. "=pod" lines are
|
|
unlikely to be interesting, but this is included for completeness.
|
|
|
|
|
|
=item C<< $parser->whiteline_handler( I<CODE_REF> ) >>
|
|
|
|
This is just like the code_handler attribute, except that it's for
|
|
lines that are seemingly blank but have whitespace (" " and/or "\t") on them,
|
|
not code lines. The same caveats apply. These lines are unlikely to be
|
|
interesting, but this is included for completeness.
|
|
|
|
|
|
=item C<< $parser->whine( I<linenumber>, I<complaint string> ) >>
|
|
|
|
This notes a problem in the Pod, which will be reported in the "Pod
|
|
Errors" section of the document and/or sent to STDERR, depending on the
|
|
values of the attributes C<no_whining>, C<no_errata_section>, and
|
|
C<complain_stderr>.
|
|
|
|
=item C<< $parser->scream( I<linenumber>, I<complaint string> ) >>
|
|
|
|
This notes an error like C<whine> does, except that it is not
|
|
suppressible with C<no_whining>. This should be used only for very
|
|
serious errors.
|
|
|
|
|
|
=item C<< $parser->source_dead(1) >>
|
|
|
|
This aborts parsing of the current document, by switching on the flag
|
|
that indicates that EOF has been seen. In particularly drastic cases,
|
|
you might want to do this. It's rather nicer than just calling
|
|
C<die>!
|
|
|
|
=item C<< $parser->hide_line_numbers( I<SOMEVALUE> ) >>
|
|
|
|
Some subclasses that indiscriminately dump event attributes (well,
|
|
except for ones beginning with "~") can use this object attribute for
|
|
refraining to dump the "start_line" attribute.
|
|
|
|
=item C<< $parser->no_whining( I<SOMEVALUE> ) >>
|
|
|
|
This attribute, if set to true, will suppress reports of non-fatal
|
|
error messages. The default value is false, meaning that complaints
|
|
I<are> reported. How they get reported depends on the values of
|
|
the attributes C<no_errata_section> and C<complain_stderr>.
|
|
|
|
=item C<< $parser->no_errata_section( I<SOMEVALUE> ) >>
|
|
|
|
This attribute, if set to true, will suppress generation of an errata
|
|
section. The default value is false -- i.e., an errata section will be
|
|
generated.
|
|
|
|
=item C<< $parser->complain_stderr( I<SOMEVALUE> ) >>
|
|
|
|
This attribute, if set to true will send complaints to STDERR. The
|
|
default value is false -- i.e., complaints do not go to STDERR.
|
|
|
|
=item C<< $parser->bare_output( I<SOMEVALUE> ) >>
|
|
|
|
Some formatter subclasses use this as a flag for whether output should
|
|
have prologue and epilogue code omitted. For example, setting this to
|
|
true for an HTML formatter class should omit the
|
|
"<html><head><title>...</title><body>..." prologue and the
|
|
"</body></html>" epilogue.
|
|
|
|
If you want to set this to true, you should probably also set
|
|
C<no_whining> or at least C<no_errata_section> to true.
|
|
|
|
=item C<< $parser->preserve_whitespace( I<SOMEVALUE> ) >>
|
|
|
|
If you set this attribute to a true value, the parser will try to
|
|
preserve whitespace in the output. This means that such formatting
|
|
conventions as two spaces after periods will be preserved by the parser.
|
|
This is primarily useful for output formats that treat whitespace as
|
|
significant (such as text or *roff, but not HTML).
|
|
|
|
=item C<< $parser->parse_empty_lists( I<SOMEVALUE> ) >>
|
|
|
|
If this attribute is set to true, the parser will not ignore empty
|
|
C<=over>/C<=back> blocks. The type of C<=over> will be I<empty>, documented
|
|
above, L<events with an element_name of over-empty>.
|
|
|
|
=back
|
|
|
|
=head1 SEE ALSO
|
|
|
|
L<Pod::Simple> -- event-based Pod-parsing framework
|
|
|
|
L<Pod::Simple::Methody> -- like Pod::Simple, but each sort of event
|
|
calls its own method (like C<start_head3>)
|
|
|
|
L<Pod::Simple::PullParser> -- a Pod-parsing framework like Pod::Simple,
|
|
but with a token-stream interface
|
|
|
|
L<Pod::Simple::SimpleTree> -- a Pod-parsing framework like Pod::Simple,
|
|
but with a tree interface
|
|
|
|
L<Pod::Simple::Checker> -- a simple Pod::Simple subclass that reads
|
|
documents, and then makes a plaintext report of any errors found in the
|
|
document
|
|
|
|
L<Pod::Simple::DumpAsXML> -- for dumping Pod documents as tidily
|
|
indented XML, showing each event on its own line
|
|
|
|
L<Pod::Simple::XMLOutStream> -- dumps a Pod document as XML (without
|
|
introducing extra whitespace as Pod::Simple::DumpAsXML does).
|
|
|
|
L<Pod::Simple::DumpAsText> -- for dumping Pod documents as tidily
|
|
indented text, showing each event on its own line
|
|
|
|
L<Pod::Simple::LinkSection> -- class for objects representing the values
|
|
of the TODO and TODO attributes of LE<lt>...E<gt> elements
|
|
|
|
L<Pod::Escapes> -- the module that Pod::Simple uses for evaluating
|
|
EE<lt>...E<gt> content
|
|
|
|
L<Pod::Simple::Text> -- a simple plaintext formatter for Pod
|
|
|
|
L<Pod::Simple::TextContent> -- like Pod::Simple::Text, but
|
|
makes no effort for indent or wrap the text being formatted
|
|
|
|
L<Pod::Simple::HTML> -- a simple HTML formatter for Pod
|
|
|
|
L<perlpod|perlpod>
|
|
|
|
L<perlpodspec|perlpodspec>
|
|
|
|
L<perldoc>
|
|
|
|
=head1 SUPPORT
|
|
|
|
Questions or discussion about POD and Pod::Simple should be sent to the
|
|
pod-people@perl.org mail list. Send an empty email to
|
|
pod-people-subscribe@perl.org to subscribe.
|
|
|
|
This module is managed in an open GitHub repository,
|
|
L<https://github.com/perl-pod/pod-simple/>. Feel free to fork and contribute, or
|
|
to clone L<git://github.com/perl-pod/pod-simple.git> and send patches!
|
|
|
|
Patches against Pod::Simple are welcome. Please send bug reports to
|
|
<bug-pod-simple@rt.cpan.org>.
|
|
|
|
=head1 COPYRIGHT AND DISCLAIMERS
|
|
|
|
Copyright (c) 2002 Sean M. Burke.
|
|
|
|
This library is free software; you can redistribute it and/or modify it
|
|
under the same terms as Perl itself.
|
|
|
|
This program is distributed in the hope that it will be useful, but
|
|
without any warranty; without even the implied warranty of
|
|
merchantability or fitness for a particular purpose.
|
|
|
|
=head1 AUTHOR
|
|
|
|
Pod::Simple was created by Sean M. Burke <sburke@cpan.org>.
|
|
But don't bother him, he's retired.
|
|
|
|
Pod::Simple is maintained by:
|
|
|
|
=over
|
|
|
|
=item * Allison Randal C<allison@perl.org>
|
|
|
|
=item * Hans Dieter Pearcey C<hdp@cpan.org>
|
|
|
|
=item * David E. Wheeler C<dwheeler@cpan.org>
|
|
|
|
=back
|
|
|
|
=for notes
|
|
Hm, my old podchecker version (1.2) says:
|
|
*** WARNING: node 'http://search.cpan.org/' contains non-escaped | or / at line 38 in file Subclassing.pod
|
|
*** WARNING: node 'http://lists.perl.org/showlist.cgi?name=pod-people' contains non-escaped | or / at line 41 in file Subclassing.pod
|
|
Yes, L<...> is hard.
|
|
|
|
|
|
=cut
|