class Ronn::Document
The Document class can be used to load and inspect a ronn document and to convert a ronn document into other formats, like roff or HTML.
Ronn files may optionally follow the naming convention: “<name>.<section>.ronn”. The <name> and <section> are used in generated documentation unless overridden by the information extracted from the document's name section.
Attributes
data: The raw input data, read from path or stream and unmodified.
date: The date the document was published; displayed centered in the document footer.
encoding: The encoding of the Ronn document.
index: The index used to resolve man and file references.
manual: The manual this document belongs to; displayed centered in the header.
name: The man page's name: usually a single word name of a program or filename; displayed along with the section in the left and right portions of the header as well as the bottom right section of the footer.
organization: The name of the group, organization, or individual responsible for this document; displayed in the left portion of the footer.
outdir: Output directory to write files to.
path: Path to the Ronn document. This may be '-' or nil when the Ronn::Document object is created with a stream, in which case stdin will be read.
section: The man page's section: a string whose first character is numeric; displayed in parentheses along with the name.
styles: Array of style modules to apply to the document.
tagline: Single sentence description of the thing being described by this man page; displayed in the NAME section.
Public Class Methods
Create a Ronn::Document given a path or with the data returned by calling the block. The document is loaded and preprocessed before the initialize method returns. The attributes hash may contain values for any writable attributes defined on this class.
# File lib/ronn/document.rb, line 71
def initialize(path = nil, attributes = {}, &block)
  @path = path
  @basename = path.to_s =~ /^-?$/ ? nil : File.basename(path)
  @reader = block ||
            lambda do |f|
              if ['-', nil].include?(f)
                STDIN.read
              else
                File.read(f, encoding: @encoding)
              end
            end
  @data = @reader.call(path)
  @name, @section, @tagline = sniff

  @styles = %w[man]
  @manual, @organization, @date = nil
  @markdown, @input_html, @html = nil
  @index = Ronn::Index[path || '.']
  @index.add_manual(self) if path && name

  attributes.each { |attr_name, value| send("#{attr_name}=", value) }
end
Public Instance Methods
Generate a file basename of the form “<name>.<section>.<type>” for the given file extension. Uses the name and section from the source file path but falls back on the name and section defined in the document.
# File lib/ronn/document.rb, line 98
def basename(type = nil)
  type = nil if ['', 'roff'].include?(type.to_s)
  [path_name || @name, path_section || @section, type]
    .compact.join('.')
end
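The fallback order used for the basename can be tried outside the class with a small stand-alone sketch. The method name `sketch_basename` and its keyword arguments are hypothetical; they stand in for the path-derived and document-derived values that the real method reads from the object.

```ruby
# Hypothetical stand-alone sketch of the basename logic; not part of Ronn.
# path_name/path_section come from the file name, doc_name/doc_section from
# the document contents. Path-derived values win when present.
def sketch_basename(path_name:, path_section:, doc_name:, doc_section:, type: nil)
  # An empty type or 'roff' produces no extension, e.g. "ronn.1".
  type = nil if ['', 'roff'].include?(type.to_s)
  [path_name || doc_name, path_section || doc_section, type].compact.join('.')
end

puts sketch_basename(path_name: 'ronn', path_section: '1',
                     doc_name: nil, doc_section: nil, type: 'html')
# prints "ronn.1.html"
puts sketch_basename(path_name: nil, path_section: nil,
                     doc_name: 'ronn', doc_section: '7', type: 'roff')
# prints "ronn.7"
```

Because nil parts are dropped by `compact`, a document with no section at all yields just "<name>" or "<name>.<type>".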
Convert the document to :roff, :html, or :html_fragment and return the result as a string.
# File lib/ronn/document.rb, line 240
def convert(format)
  send "to_#{format}"
end
The date the man page was published. If not set explicitly, this is the file's modified time or, if no file is given, the current time. Displayed centered in the document footer.
# File lib/ronn/document.rb, line 184
def date
  return @date if @date

  return File.mtime(path) if File.exist?(path)

  Time.now
end
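The three-step fallback described for the date (explicit value, then file modification time, then the current time) can be sketched as a plain function. The name `sketch_date` is hypothetical, and the explicit nil check on the path is an addition of this sketch rather than something taken from the listing above.

```ruby
# Hypothetical sketch of the date fallback chain; not part of Ronn.
def sketch_date(explicit_date, path)
  # 1. An explicitly assigned date always wins.
  return explicit_date if explicit_date
  # 2. Otherwise use the source file's modification time, when a file exists.
  #    (The nil guard is an assumption added for stream input.)
  return File.mtime(path) if path && File.exist?(path)

  # 3. Otherwise fall back to the current time.
  Time.now
end
```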
A Nokogiri DocumentFragment for the manual content fragment.
# File lib/ronn/document.rb, line 234
def html
  @html ||= process_html!
end
Preprocessed markdown input text.
# File lib/ronn/document.rb, line 229
def markdown
  @markdown ||= process_markdown!
end
Returns the manual page name based first on the document's contents and then on the path name. Usually a single word name of a program or filename; displayed along with the section in the left and right portions of the header as well as the bottom right section of the footer.
# File lib/ronn/document.rb, line 140
def name
  @name || path_name
end
True when the name was extracted from the name section of the document.
# File lib/ronn/document.rb, line 146
def name?
  !@name.nil?
end
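Construct a path for a file near the source file. Uses the Document#basename method to generate the basename part and appends it to the output directory when outdir is set, or otherwise to the dirname of the source document. When neither is available, the bare basename is returned.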
# File lib/ronn/document.rb, line 107
def path_for(type = nil)
  if @outdir
    File.join(@outdir, basename(type))
  elsif @basename
    File.join(File.dirname(path), basename(type))
  else
    basename(type)
  end
end
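The order in which the output location is chosen can be illustrated with a stand-alone sketch. `sketch_path_for` and its parameters are hypothetical; the already-computed basename is passed in directly instead of being derived from a document.

```ruby
# Hypothetical sketch of the path selection; not part of Ronn.
def sketch_path_for(source_path, basename, outdir: nil)
  if outdir
    # An explicit output directory takes precedence.
    File.join(outdir, basename)
  elsif !source_path.to_s.match?(/\A-?\z/)
    # Otherwise write next to the source file.
    File.join(File.dirname(source_path), basename)
  else
    # Stream input ('-' or nil): no directory is known.
    basename
  end
end

puts sketch_path_for('man/ronn.1.ronn', 'ronn.1.html')
# prints "man/ronn.1.html"
puts sketch_path_for('man/ronn.1.ronn', 'ronn.1.html', outdir: 'build')
# prints "build/ronn.1.html"
```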
Returns the <name> part of the path, or nil when no path is available. This is used as the manual page name when the file contents do not include a name section.
# File lib/ronn/document.rb, line 120
def path_name
  return unless @basename

  parts = @basename.split('.')
  parts.pop if parts.length > 1 && parts.last =~ /^\w+$/
  parts.pop if parts.last =~ /^\d+$/
  parts.join('.')
end
Returns the <section> part of the path, or nil when no path is available.
# File lib/ronn/document.rb, line 131
def path_section
  $1 if @basename.to_s =~ /\.(\d\w*)\./
end
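To see how a file name following the “<name>.<section>.ronn” convention is split, the two path helpers can be mirrored as plain functions over a basename string. The names `sketch_path_name` and `sketch_path_section` are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helpers mirroring the logic of path_name and path_section,
# operating on a plain basename string instead of a Ronn::Document.
def sketch_path_name(basename)
  parts = basename.split('.')
  # Drop a trailing word-like extension such as "ronn".
  parts.pop if parts.length > 1 && parts.last =~ /^\w+$/
  # Drop a trailing all-digit section such as "1".
  parts.pop if parts.last =~ /^\d+$/
  parts.join('.')
end

def sketch_path_section(basename)
  # The section is a dot-delimited component that starts with a digit.
  basename[/\.(\d\w*)\./, 1]
end

p sketch_path_name('ronn.1.ronn')     # => "ronn"
p sketch_path_section('ronn.1.ronn')  # => "1"
p sketch_path_section('hello.ronn')   # => nil
```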
The name used to reference this manual.
# File lib/ronn/document.rb, line 164
def reference_name
  name + (section && "(#{section})").to_s
end
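The reference name is the familiar “name(section)” form, with the parenthesized part omitted when there is no section. A minimal sketch, using a hypothetical `sketch_reference_name` function that takes both values as arguments:

```ruby
# Hypothetical sketch of the reference name format; not part of Ronn.
def sketch_reference_name(name, section)
  # (nil && ...) is nil, and nil.to_s is "", so a missing section adds nothing.
  name + (section && "(#{section})").to_s
end

p sketch_reference_name('ronn', '1')  # => "ronn(1)"
p sketch_reference_name('ronn', nil)  # => "ronn"
```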
Returns the manual page section based first on the document's contents and then on the path name. A string whose first character is numeric; displayed in parentheses along with the name.
# File lib/ronn/document.rb, line 153
def section
  @section || path_section
end
True when the section number was extracted from the name section of the document.
# File lib/ronn/document.rb, line 159
def section?
  !@section.nil?
end
Sniff the document header and extract basic document metadata. Return a tuple of the form: [name, section, description], where missing information is represented by nil and any element may be missing.
# File lib/ronn/document.rb, line 210
def sniff
  html = Kramdown::Document.new(data[0, 512], auto_ids: false, smart_quotes: ['apos', 'apos', 'quot', 'quot'], typographic_symbols: { hellip: '...', ndash: '--', mdash: '--' }).to_html
  heading, html = html.split("</h1>\n", 2)
  return [nil, nil, nil] if html.nil?

  case heading
  when /([\w_.\[\]~+=@:-]+)\s*\((\d\w*)\)\s*-+\s*(.*)/
    # name(section) -- description
    [$1, $2, $3]
  when /([\w_.\[\]~+=@:-]+)\s+-+\s+(.*)/
    # name -- description
    [$1, nil, $2]
  else
    # description
    [nil, nil, heading.sub('<h1>', '')]
  end
end
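The three heading shapes that sniff distinguishes can be exercised without Kramdown by applying the same patterns to the heading text directly. This is a sketch under that assumption: `sketch_parse_heading` and `SKETCH_NAME_CHARS` are hypothetical names, and the input is the plain heading text rather than rendered HTML.

```ruby
# Hypothetical sketch of the heading classification used by sniff.
# It works on plain heading text; the real method first renders the
# beginning of the document to HTML.
SKETCH_NAME_CHARS = '[\w_.\[\]~+=@:-]+'

def sketch_parse_heading(heading)
  case heading
  when /(#{SKETCH_NAME_CHARS})\s*\((\d\w*)\)\s*-+\s*(.*)/
    [$1, $2, $3]          # name(section) -- description
  when /(#{SKETCH_NAME_CHARS})\s+-+\s+(.*)/
    [$1, nil, $2]         # name -- description
  else
    [nil, nil, heading]   # free-form title
  end
end

p sketch_parse_heading('ronn(1) -- convert markdown files to manpages')
# => ["ronn", "1", "convert markdown files to manpages"]
p sketch_parse_heading('ronn -- convert markdown files')
# => ["ronn", nil, "convert markdown files"]
p sketch_parse_heading('A Custom Title')
# => [nil, nil, "A Custom Title"]
```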
Styles to insert in the generated HTML output. This is a simple Array of string module names or file paths.
# File lib/ronn/document.rb, line 203
def styles=(styles)
  @styles = (%w[man] + styles).uniq
end
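The setter always keeps the `man` style first and removes duplicates. A short sketch of that merge, using a hypothetical `sketch_merge_styles` function and made-up style names:

```ruby
# Hypothetical sketch of the style merge; not part of Ronn.
def sketch_merge_styles(styles)
  # 'man' is always prepended; uniq keeps the first occurrence of each name.
  (%w[man] + styles).uniq
end

p sketch_merge_styles(%w[toc man dark])  # => ["man", "toc", "dark"]
```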
The document's title when no name section was defined. When a name section exists, this value is nil.
# File lib/ronn/document.rb, line 177
def title
  @tagline unless name?
end
Truthy when the document started with an h1 but did not follow the “<name>(<sect>) -- <tagline>” convention. We assume this is some kind of custom title.
# File lib/ronn/document.rb, line 171
def title?
  !name? && tagline
end
# File lib/ronn/document.rb, line 285
def to_h
  %w[name section tagline manual organization date styles toc]
    .each_with_object({}) { |name, hash| hash[name] = send(name) }
end
Convert the document to HTML and return the result as a string. The returned string is a complete HTML document.
# File lib/ronn/document.rb, line 255
def to_html
  layout = ENV['RONN_LAYOUT']
  layout_path = nil
  if layout
    layout_path = File.expand_path(layout)
    unless File.exist?(layout_path)
      warn "warn: can't find #{layout}, using default layout."
      layout_path = nil
    end
  end

  template = Ronn::Template.new(self)
  template.context.push html: to_html_fragment(nil)
  template.render(layout_path || 'default')
end
Convert the document to HTML and return the result as a string. The HTML does not include <html>, <head>, or <style> tags.
# File lib/ronn/document.rb, line 274
def to_html_fragment(wrap_class = 'mp')
  frag_nodes = html.at('body').children
  out = frag_nodes.to_s.rstrip
  out = "<div class='#{wrap_class}'>#{out}\n</div>" unless wrap_class.nil?
  out
end
# File lib/ronn/document.rb, line 295
def to_json(*_args)
  require 'json'
  to_h.merge('date' => date.iso8601).to_json
end
# File lib/ronn/document.rb, line 281
def to_markdown
  markdown
end
Convert the document to roff and return the result as a string.
# File lib/ronn/document.rb, line 245
def to_roff
  RoffFilter.new(
    to_html_fragment(nil),
    name, section, tagline,
    manual, organization, date
  ).to_s
end
# File lib/ronn/document.rb, line 290
def to_yaml
  require 'yaml'
  to_h.to_yaml
end
Retrieve a list of top-level section headings in the document and return as an array of [id, text] tuples, where id is the element's generated id (upcased) and text is the inner text of the heading element.
# File lib/ronn/document.rb, line 195
def toc
  @toc ||=
    html.search('h2[@id]').map { |h2| [h2.attributes['id'].content.upcase, h2.inner_text] }
end
Protected Instance Methods
HTMLize the manual page reference. The result is an <a> if the page appears in the index, otherwise it is a <span>. The first argument may be an HTML element or a string. The second should be a string of the form “(#{section})”.
# File lib/ronn/document.rb, line 504
def html_build_manual_reference_link(name_or_node, section)
  name = if name_or_node.respond_to?(:inner_text)
           name_or_node.inner_text
         else
           name_or_node
         end
  ref = index["#{name}#{section}"]
  if ref
    "<a class='man-ref' href='#{ref.url}'>#{name_or_node}<span class='s'>#{section}</span></a>"
  else
    # warn "warn: manual reference not defined: '#{name}#{section}'"
    "<span class='man-ref'>#{name_or_node}<span class='s'>#{section}</span></span>"
  end
end
Perform angle quote (<THESE>) post filtering.
# File lib/ronn/document.rb, line 381
def html_filter_angle_quotes
  # convert all angle quote vars nested in code blocks
  # back to the original text
  code_nodes = @html.search('code')
  code_nodes.search('.//text() | text()').each do |node|
    next unless node.to_html.include?('var>')

    new =
      node.to_html
          .gsub('<var>', '<')
          .gsub('</var>', '>')
    node.swap(new)
  end
end
Add a 'data-bare-link' attribute to hyperlinks whose text labels are the same as their href URLs.
# File lib/ronn/document.rb, line 453
def html_filter_annotate_bare_links
  @html.search('a[@href]').each do |node|
    href = node.attributes['href'].content
    text = node.inner_text

    next unless href == text || href[0] == '#' ||
                CGI.unescapeHTML(href) == "mailto:#{CGI.unescapeHTML(text)}"

    node.set_attribute('data-bare-link', 'true')
  end
end
Convert special format unordered lists to definition lists.
# File lib/ronn/document.rb, line 397
def html_filter_definition_lists
  # process all unordered lists depth-first
  @html.search('ul').to_a.reverse_each do |ul|
    items = ul.search('li')
    next if items.any? { |item| item.inner_text.strip.split("\n", 2).first !~ /:$/ }

    dl = Nokogiri::XML::Node.new 'dl', html
    items.each do |item|
      # This processing is specific to how Markdown generates definition lists
      term, definition = item.inner_html.strip.split(":\n", 2)
      term = term.sub(/^<p>/, '')

      dt = Nokogiri::XML::Node.new 'dt', html
      dt.children = Nokogiri::HTML.fragment(term)
      dt.attributes['class'] = 'flush' if dt.inner_text.length <= 7

      dd = Nokogiri::XML::Node.new 'dd', html
      dd_contents = Nokogiri::HTML.fragment(definition)
      dd.children = dd_contents

      dl.add_child(dt)
      dl.add_child(dd)
    end
    ul.replace(dl)
  end
end
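What makes an unordered list "special format" is that the first line of every item ends with a colon. The rule can be sketched over plain strings, without Nokogiri. The function names below are hypothetical, and the items are assumed to be the text of each list item.

```ruby
# Hypothetical sketch of the definition-list detection; not part of Ronn.
# Items are plain strings standing in for the content of each <li>.
def sketch_definition_list?(items)
  # Every item must have a first line that ends with a colon.
  items.all? { |item| item.strip.split("\n", 2).first.to_s.end_with?(':') }
end

def sketch_split_item(item)
  term, definition = item.strip.split(":\n", 2)
  term = term.to_s
  # Terms of 7 characters or fewer are marked 'flush' by the real filter.
  { term: term, definition: definition, flush: term.length <= 7 }
end

options = ["-v:\n  Verbose output.", "--output=<file>:\n  Write to <file>."]
p sketch_definition_list?(options)                   # => true
p sketch_definition_list?(['plain item', options[0]]) # => false
```

A single item without a trailing colon on its first line is enough to leave the whole list as a regular unordered list.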
Add URL anchors to all HTML heading elements.
# File lib/ronn/document.rb, line 444
def html_filter_heading_anchors
  h_nodes = @html.search('//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 and not(@id)]')
  h_nodes.each do |heading|
    heading.set_attribute('id', heading.inner_text.gsub(/\W+/, '-'))
  end
end
# File lib/ronn/document.rb, line 424
def html_filter_inject_name_section
  markup =
    if title?
      "<h1>#{title}</h1>"
    elsif name
      "<h2>NAME</h2>\n" \
        "<p class='man-name'>\n  <code>#{name}</code>" +
        (tagline ? " - <span class='man-whatis'>#{tagline}</span>\n" : "\n") +
        "</p>\n"
    end
  return unless markup

  if html.at('body').first_element_child
    html.at('body').first_element_child.before(Nokogiri::HTML.fragment(markup))
  else
    html.at('body').add_child(Nokogiri::HTML.fragment(markup))
  end
end
Convert text of the form “name(section)” or “<code>name</code>(section)” to a hyperlink. The URL is obtained from the index.
# File lib/ronn/document.rb, line 467
def html_filter_manual_reference_links
  return if index.nil?

  name_pattern = '[0-9A-Za-z_:.+=@~-]+'

  # Convert "name(section)" by traversing text nodes searching for
  # text that fits the pattern.  This is the original implementation.
  @html.search('.//text() | text()').each do |node|
    next unless node.content.include?(')')
    next if %w[pre code h1 h2 h3].include?(node.parent.name)
    next if child_of?(node, 'a')
    node.swap(node.content.gsub(/(#{name_pattern})(\(\d+\w*\))/) do
      html_build_manual_reference_link($1, $2)
    end)
  end

  # Convert "<code>name</code>(section)" by traversing <code> nodes.
  # For each one that contains exactly an acceptable manual page name,
  # the next sibling is checked and must be a text node beginning
  # with a valid section in parentheses.
  @html.search('code').each do |node|
    next if %w[pre code h1 h2 h3].include?(node.parent.name)
    next if child_of?(node, 'a')
    next unless node.inner_text =~ /^#{name_pattern}$/
    sibling = node.next
    next unless sibling
    next unless sibling.text?
    next unless sibling.content =~ /^\((\d+\w*)\)/
    node.swap(html_build_manual_reference_link(node, "(#{$1})"))
    sibling.content = sibling.content.gsub(/^\(\d+\w*\)/, '')
  end
end
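The plain-text half of this filter, together with the link-or-span choice made when building a reference, can be sketched on a string with a Hash standing in for the index. This is only an approximation: `sketch_link_references` is a hypothetical name, the Hash maps reference names straight to URLs, and the node-level exclusions (code, pre, headings, existing links) are not modeled.

```ruby
# Hypothetical sketch of "name(section)" linking on plain text; not part of
# Ronn. The index is assumed to be a Hash of reference name => URL.
SKETCH_NAME_PATTERN = '[0-9A-Za-z_:.+=@~-]+'

def sketch_link_references(text, index)
  text.gsub(/(#{SKETCH_NAME_PATTERN})(\(\d+\w*\))/) do
    name = $1
    section = $2
    url = index["#{name}#{section}"]
    if url
      # Known page: emit a hyperlink.
      "<a class='man-ref' href='#{url}'>#{name}<span class='s'>#{section}</span></a>"
    else
      # Unknown page: mark it up without linking.
      "<span class='man-ref'>#{name}<span class='s'>#{section}</span></span>"
    end
  end
end

index = { 'grep(1)' => 'grep.1.html' }
puts sketch_link_references('See grep(1) and sed(1).', index)
```

With this index, `grep(1)` becomes an `<a>` element and `sed(1)` becomes a `<span>`, matching the two branches of the reference builder.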
# File lib/ronn/document.rb, line 312
def input_html
  @input_html ||= strip_heading(Kramdown::Document.new(markdown, auto_ids: false, smart_quotes: ['apos', 'apos', 'quot', 'quot'], typographic_symbols: { hellip: '...', ndash: '--', mdash: '--' }).to_html)
end
Convert <WORD> to <var>WORD</var> but only if WORD isn't an HTML tag.
# File lib/ronn/document.rb, line 367
def markdown_filter_angle_quotes(markdown)
  markdown.gsub(/<([^:.\/]+?)>/) do |match|
    contents = $1
    tag, attrs = contents.split(' ', 2)
    if attrs =~ /\/=/ || html_element?(tag.sub(/^\//, '')) ||
       data.include?("</#{tag}>") || contents =~ /^!/
      match.to_s
    else
      "<var>#{contents}</var>"
    end
  end
end
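A simplified sketch of the angle-quote conversion follows. It is not the library's implementation: `sketch_angle_quotes` is a hypothetical name, the list of HTML element names is a small made-up subset standing in for the real element check, and the attribute test from the listing above is left out.

```ruby
# Hypothetical, simplified sketch of the <WORD> to <var>WORD</var> filter.
# SKETCH_HTML_ELEMENTS is an assumed, incomplete list of tag names.
SKETCH_HTML_ELEMENTS = %w[a b br code div em i p pre span strong var].freeze

def sketch_angle_quotes(markdown)
  markdown.gsub(/<([^:.\/]+?)>/) do |match|
    contents = $1
    tag = contents.split(' ', 2).first
    # Leave the text alone when it looks like real HTML: a known element,
    # something with a matching closing tag, or a comment/doctype.
    keep = tag.nil? ||
           SKETCH_HTML_ELEMENTS.include?(tag) ||
           markdown.include?("</#{tag}>") ||
           contents.start_with?('!')
    keep ? match : "<var>#{contents}</var>"
  end
end

puts sketch_angle_quotes('ronn <file> renders <em>text</em>')
# prints "ronn <var>file</var> renders <em>text</em>"
```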
Add [id]: #ANCHOR elements to the markdown source text for all sections. This lets us use the [SECTION-REF][] syntax.
# File lib/ronn/document.rb, line 354
def markdown_filter_heading_anchors(markdown)
  first = true
  markdown.split("\n").grep(/^[#]{2,5} +[\w '-]+[# ]*$/).each do |line|
    markdown << "\n\n" if first
    first = false
    title = line.gsub(/[^\w -]/, '').strip
    anchor = title.gsub(/\W+/, '-').gsub(/(^-+|-+$)/, '')
    markdown << "[#{title}]: ##{anchor} \"#{title}\"\n"
  end
  markdown
end
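The reference definitions appended for each section heading can be inspected with a stand-alone sketch. `sketch_heading_anchor_links` is a hypothetical name, and unlike the listing above it returns a new string instead of appending to its argument.

```ruby
# Hypothetical sketch of the heading anchor filter; not part of Ronn.
# Returns a new string rather than mutating the input.
def sketch_heading_anchor_links(markdown)
  # Level 2 to 5 ATX headings made of word characters, spaces, ' and -.
  headings = markdown.split("\n").grep(/^[#]{2,5} +[\w '-]+[# ]*$/)
  links = headings.map do |line|
    title = line.gsub(/[^\w -]/, '').strip
    anchor = title.gsub(/\W+/, '-').gsub(/(^-+|-+$)/, '')
    "[#{title}]: ##{anchor} \"#{title}\"\n"
  end
  links.empty? ? markdown : "#{markdown}\n\n#{links.join}"
end

puts sketch_heading_anchor_links("## SEE ALSO\n\nSome text.\n")
```

For the input above, the appended definition is `[SEE ALSO]: #SEE-ALSO "SEE ALSO"`, which is what allows `[SEE ALSO][]` to resolve elsewhere in the document.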
Appends all index links to the end of the document as Markdown reference links. This lets us use [foo(3)][] syntax to link to index entries.
# File lib/ronn/document.rb, line 344
def markdown_filter_link_index(markdown)
  return markdown if index.nil? || index.empty?

  markdown << "\n\n"
  index.each { |ref| markdown << "[#{ref.name}]: #{ref.url}\n" }
  markdown
end
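The effect of appending the index can be sketched with a small Struct standing in for index entries. `SketchRef` and `sketch_append_link_index` are hypothetical names; the only assumption about real index entries is that they respond to `name` and `url`, as the listing above uses them.

```ruby
# Hypothetical sketch of the link index filter; not part of Ronn.
SketchRef = Struct.new(:name, :url)

def sketch_append_link_index(markdown, refs)
  return markdown if refs.empty?

  # One Markdown reference definition per index entry.
  definitions = refs.map { |ref| "[#{ref.name}]: #{ref.url}\n" }.join
  "#{markdown}\n\n#{definitions}"
end

refs = [SketchRef.new('grep(1)', 'grep.1.html')]
puts sketch_append_link_index('See [grep(1)][].', refs)
```

Once the definition `[grep(1)]: grep.1.html` is present, the Markdown processor resolves `[grep(1)][]` as an ordinary reference link.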
Parse the document and extract the name, section, and tagline from its contents. This is called while the object is being initialized.
# File lib/ronn/document.rb, line 307
def preprocess!
  input_html
  nil
end
# File lib/ronn/document.rb, line 327
def process_html!
  wrapped_html = "<html>\n  <body>\n#{input_html}\n  </body>\n</html>"
  @html = Nokogiri::HTML.parse(wrapped_html)
  html_filter_angle_quotes
  html_filter_definition_lists
  html_filter_inject_name_section
  html_filter_heading_anchors
  html_filter_annotate_bare_links
  html_filter_manual_reference_links
  @html
end
# File lib/ronn/document.rb, line 321
def process_markdown!
  md = markdown_filter_heading_anchors(data)
  md = markdown_filter_link_index(md)
  markdown_filter_angle_quotes(md)
end
# File lib/ronn/document.rb, line 316
def strip_heading(html)
  heading, html = html.split("</h1>\n", 2)
  html || heading
end