There is a good article titled “Converting a Word Doc and PDF into a Drupal Book for web display” about using the Drupal 5 version of this module to convert a 945-page PhD dissertation into a Drupal book. The Drupal 5 version is no longer supported, but the Drupal 6 and Drupal 7 versions work the same way, and the general method for conversion is the same.
Before you start, make sure you've prepared your HTML for conversion. Badly structured HTML will yield bad results. Make sure the top-level heading in your document uses the <h1> tag, and that all heading levels are properly nested (an <h3> tag that follows an <h1> tag without an intermediate <h2> tag is not properly nested).
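The nesting rule above can be checked mechanically before you paste anything into Drupal. The following is a minimal sketch in Python using only the standard library; it is not part of the module, and the names `HeadingCollector` and `find_nesting_problems` are made up for this illustration. It assumes that a document whose first heading is not <h1> should also be reported.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the level (1-6) of every heading tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def find_nesting_problems(html):
    """Return a list of readable problems; an empty list means nesting is fine."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()

    problems = []
    previous = 0  # Assumption: treat the start of the document as "level 0".
    for position, level in enumerate(collector.levels, start=1):
        # A heading may go at most one level deeper than the one before it.
        if level > previous + 1:
            before = f"<h{previous}>" if previous else "the start of the document"
            problems.append(
                f"heading {position}: <h{level}> follows {before} "
                f"without an intermediate <h{previous + 1}>"
            )
        previous = level
    return problems


if __name__ == "__main__":
    sample = "<h1>Thesis</h1><h3>Skipped a level</h3>"
    for problem in find_nesting_problems(sample):
        print(problem)
```

Going back up by any number of levels (for example from <h3> to <h1>) is not reported, since only a skipped level on the way down breaks the nesting.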
Clean and simple HTML works best. For best results, clean up the HTML by hand. If you're unable to do this, you may try running it through an online version of HTML Tidy, or use the core HTML corrector text filter and/or the HTML Tidy module to clean up the HTML.
If the source is Microsoft Word, the text should be saved as “Web page, Filtered” or “Filtered HTML”. When setting up the HTML Tidy module, choose the option to clean up Microsoft Word text.
Do not include any material from the HTML <head>. Delete everything from the top of the HTML file, including the <body> tag.
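If you prefer to script this step, the sketch below shows one way to do it in Python. It is an illustration only: `extract_body` is a hypothetical helper, not something the module provides. It also drops the closing </body> tag and anything after it, which the instructions above do not mention and which is an assumption made here.

```python
import re


def extract_body(html):
    """Return only the markup found inside <body ...> ... </body>.

    If no <body> tag is present, the input is returned unchanged
    (apart from surrounding whitespace).
    """
    opening = re.search(r"<body\b[^>]*>", html, flags=re.IGNORECASE)
    if opening is None:
        return html.strip()

    start = opening.end()
    # Assumption: also discard the closing </body> tag and what follows it.
    closing = re.search(r"</body\s*>", html[start:], flags=re.IGNORECASE)
    end = start + closing.start() if closing else len(html)
    return html[start:end].strip()


if __name__ == "__main__":
    page = (
        "<html><head><title>Thesis</title></head>"
        "<body lang='en'><h1>Chapter 1</h1><p>Text</p></body></html>"
    )
    print(extract_body(page))
```

A regular expression is enough here because only two well-known tags are being located; it is not a general-purpose HTML parser.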
To convert HTML into a book, create content of type “Book page”, paste the book text into the “Body” field, and select which headings to split on in the “HTML2Book Splitter” field group, as illustrated below:
All text before the first HTML heading tag will be retained as the body of the original (book root) page.
Each new book page will have the same author, categories, settings, and other characteristics of the original page.
Subsequent pages will be added as children of the book root page, using the heading as their title and all text from that point to the next heading as their body. Child pages will be nested based on the subheading number, provided the subheadings are logically organized.
You can choose which heading levels will be used to create new pages. You may, for example, create new pages only when an <h1> heading is encountered, or make a new page at every heading.
If Organic Groups is used and the original page has been assigned to one or more groups, all child book pages will belong to the same groups.
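The splitting behaviour described above can be pictured with a small Python sketch. This is not the module's code (the module itself is written for Drupal); `Page`, `split_into_book`, and the `split_levels` parameter are names invented for this illustration. The sketch assumes that the heading tag itself is removed from the page body once it has been used as the title, and it leaves out author, categories, and group membership.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Page:
    title: str
    body: str
    level: int = 0  # 0 marks the book root page
    children: list = field(default_factory=list)


def split_into_book(html, root_title, split_levels=(1, 2, 3)):
    """Split html into a tree of pages at the chosen heading levels."""
    if not split_levels:
        raise ValueError("choose at least one heading level to split on")
    digits = "".join(str(n) for n in sorted(set(split_levels)))
    heading = re.compile(
        rf"<h([{digits}])\b[^>]*>(.*?)</h\1\s*>", re.IGNORECASE | re.DOTALL
    )
    matches = list(heading.finditer(html))

    # Text before the first heading stays on the root page.
    first = matches[0].start() if matches else len(html)
    root = Page(title=root_title, body=html[:first].strip())

    stack = [root]  # Path from the root to the most recently created page.
    for i, match in enumerate(matches):
        level = int(match.group(1))
        body_end = matches[i + 1].start() if i + 1 < len(matches) else len(html)
        title = re.sub(r"<[^>]+>", "", match.group(2)).strip()
        page = Page(title=title, body=html[match.end():body_end].strip(), level=level)

        # Climb back up until the top of the stack is a shallower page.
        while stack[-1].level >= level:
            stack.pop()
        stack[-1].children.append(page)
        stack.append(page)
    return root


if __name__ == "__main__":
    text = "<p>Intro</p><h1>One</h1><p>a</p><h2>One-A</h2><p>b</p><h1>Two</h1><p>c</p>"
    book = split_into_book(text, "My book", split_levels=(1, 2))
    for chapter in book.children:
        print(chapter.title, [child.title for child in chapter.children])
```

Passing `split_levels=(1,)` corresponds to creating new pages only at <h1> headings; lower-level headings then remain inside the body of the page they belong to.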