Importing HTML into InDesign


What to Import?

We have recently been looking for an easy way to import HTML files into InDesign. This is not a simple task, because HTML files are created for online use while InDesign is intended primarily for print output. For example, images referenced in <img> tags often have a resolution of only 72 dpi, which is not suitable for print. It is primarily the text-based elements in the HTML file (including tables) that will be most useful to import.

Some of the Issues

One of the issues with converting from CSS is that, more often than not, a stylesheet does not specify every value but relies on defaults. InDesign styles, by contrast, hold an explicit value for each style attribute in the InDesign interface. In HTML, styles can be set either directly on tags or through classes. Any plug-in or script attempting to reproduce the styles would need to reproduce the defaults and then detect where the CSS code modifies them. A much better solution might be to create a set of styles in an InDesign template, together with a mapping that matches InDesign styles to HTML tags and CSS classes. However, it could also be useful to have some functionality to generate the Paragraph Styles in InDesign, to ensure that there is always a match between the tags and classes in the HTML and the styles in the InDesign document.
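As a rough illustration of the matching idea, the sketch below looks up an InDesign paragraph style name from a tag and an optional class. The style names, the STYLE_MAP table and the match_style function are all hypothetical; a real template would define its own names.

```python
# Hypothetical mapping from (tag, class) pairs to InDesign paragraph style names.
# The style names are illustrative and would need to exist in the template.
STYLE_MAP = {
    ("h1", None): "Heading 1",
    ("h1", "heading1"): "Heading 1 Custom",
    ("p", None): "Body Text",
    ("p", "intro"): "Intro Paragraph",
}

# Assumed fallback for tags that have no entry in the mapping.
FALLBACK_STYLE = "[Basic Paragraph]"


def match_style(tag, css_class=None):
    """Return a style name for a tag, preferring a class-specific match."""
    tag = tag.lower()
    if (tag, css_class) in STYLE_MAP:
        return STYLE_MAP[(tag, css_class)]
    # Fall back to the tag-only entry, then to the default style.
    return STYLE_MAP.get((tag, None), FALLBACK_STYLE)
```

A class that is not listed simply falls back to the tag-level style, so the mapping only needs entries for the combinations that should look different.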

CSS Defaults for Text

This is an example of how the defaults of text-based CSS properties match up with InDesign defaults:


CSS Property      Default Value   InDesign Property   InDesign Default Value
font-family       sans-serif      fontFamily          Arial
font-size         16px            fontSize            12pt
color             black           fillColor           Black
line-height       normal          leading             Auto
text-align        left            justification       LeftAlign
text-transform    none            capitalization      Normal
font-weight       normal          fontStyle           Regular
font-style        normal          fontStyle           Regular
text-decoration   none            underline           None
letter-spacing    normal          kerning             Auto
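One way to put the table above to work is to encode it as data and use it to find the declared CSS values that differ from the defaults, since only those would need an explicit setting in InDesign. This is a minimal sketch; the names CSS_TO_INDESIGN and non_default_overrides are hypothetical, and the values are copied from the table rather than from any InDesign API.

```python
# CSS property -> (CSS default, InDesign property, InDesign default),
# copied from the table above.
CSS_TO_INDESIGN = {
    "font-family": ("sans-serif", "fontFamily", "Arial"),
    "font-size": ("16px", "fontSize", "12pt"),
    "color": ("black", "fillColor", "Black"),
    "line-height": ("normal", "leading", "Auto"),
    "text-align": ("left", "justification", "LeftAlign"),
    "text-transform": ("none", "capitalization", "Normal"),
    "font-weight": ("normal", "fontStyle", "Regular"),
    "font-style": ("normal", "fontStyle", "Regular"),
    "text-decoration": ("none", "underline", "None"),
    "letter-spacing": ("normal", "kerning", "Auto"),
}


def non_default_overrides(declared):
    """Return {css_property: (indesign_property, css_value)} for every
    declared value that differs from the CSS default in the table."""
    result = {}
    for prop, value in declared.items():
        entry = CSS_TO_INDESIGN.get(prop)
        if entry is None:
            continue  # property is not covered by the table
        css_default, indesign_prop, _ = entry
        if value.strip().lower() != css_default:
            result[prop] = (indesign_prop, value.strip())
    return result
```

The returned CSS values still have to be translated into InDesign terms (for example "center" into a justification setting); the sketch only identifies which properties need attention.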


Typically, the CSS properties accept the following values, which vary in how easily they can be recreated in InDesign:

font-family: Any font name (e.g. “Arial”, “Helvetica”, “Times New Roman”) or a list of font names separated by commas (e.g. “Arial, sans-serif”). The browser will use the first available font in the list. The default value is “sans-serif”.
font-size: Any valid CSS length value (e.g. 10px, 12pt, 2em). The default value is 16px.
color: Any valid CSS color value (e.g. red, #ff0000, rgb(255, 0, 0)). The default value is black.
line-height: A unitless number, a CSS length value, or a percentage of the font size (e.g. 1.5, 20px, 150%). The default value is normal, which is roughly equivalent to a line height of 1.2 in most browsers.
text-align: left, right, center, justify. The default value is left.
text-transform: none, uppercase, lowercase, capitalize. The default value is none.
font-weight: normal, bold, bolder, lighter, or a number between 100 and 900 in increments of 100 (e.g. 400, 700). The default value is normal.
font-style: normal, italic, oblique. The default value is normal.
text-decoration: none, underline, overline, line-through. The default value is none.
letter-spacing: Any valid CSS length value (e.g. 1px, 2pt, 0.1em). The default value is normal, which is equivalent to a letter spacing of 0 in most browsers.
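Length values are one of the easier cases to recreate, because CSS fixes the ratio between pixels and points (1in = 96px = 72pt), so 16px is exactly 12pt. The sketch below converts a font-size value to points; the function name and the 12pt parent size are assumptions, and only px, pt, em and % are handled.

```python
import re

PX_TO_PT = 72 / 96  # CSS defines 1in = 96px = 72pt


def font_size_to_points(value, parent_pt=12.0):
    """Convert a CSS font-size such as '16px', '12pt', '2em' or '150%' to points.

    parent_pt is the assumed size of the parent element, used for em and %.
    """
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(px|pt|em|%)\s*", value.lower())
    if match is None:
        raise ValueError(f"unsupported font-size value: {value!r}")
    number, unit = float(match.group(1)), match.group(2)
    if unit == "pt":
        return number
    if unit == "px":
        return number * PX_TO_PT
    if unit == "em":
        return number * parent_pt
    return number / 100 * parent_pt  # percentage of the parent size
```

Keyword sizes such as "large" and other units are deliberately rejected here rather than guessed at.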


However, another issue with HTML is that the tags themselves effectively have default styles.

For example, this is what we might expect as the defaults for the <h1> tag:

<h1 style="display: block; font-size: 2em; font-weight: bold; margin-top: 0; margin-bottom: 0.67em; text-align: left;">This is an h1 element</h1>

These defaults could then be overridden (potentially more than once), including by one or more classes applied in the HTML. For example:

<h1 class="heading1">
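The layering described here can be sketched as a series of dictionary updates: start from the tag defaults, apply each class rule in order, then apply any inline style. This is a simplified model that ignores CSS specificity, inheritance and !important; the HEADING1_RULE values are invented for illustration, and the function names are hypothetical.

```python
def parse_declarations(style_text):
    """Parse 'prop: value; prop: value' into a dict (no support for ';' inside values)."""
    declarations = {}
    for part in style_text.split(";"):
        if ":" not in part:
            continue  # skip empty or malformed fragments
        prop, value = part.split(":", 1)
        declarations[prop.strip().lower()] = value.strip()
    return declarations


# Tag defaults taken from the <h1> example above.
H1_DEFAULTS = parse_declarations(
    "display: block; font-size: 2em; font-weight: bold; "
    "margin-top: 0; margin-bottom: 0.67em; text-align: left;"
)

# Hypothetical rule for the "heading1" class; the values are assumptions.
HEADING1_RULE = {"font-size": "28px", "text-align": "center"}


def resolve_style(tag_defaults, class_rules=(), inline_style=""):
    """Layer class rules and an inline style on top of the tag defaults."""
    resolved = dict(tag_defaults)
    for rule in class_rules:
        resolved.update(rule)  # later rules override earlier ones
    resolved.update(parse_declarations(inline_style))  # inline style is applied last
    return resolved
```

For instance, resolve_style(H1_DEFAULTS, [HEADING1_RULE]) keeps the bold weight from the tag defaults but takes the size and alignment from the class.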

The quality of the conversion to InDesign likely depends on how well the original HTML design was implemented.

The Best Solution

The easiest solution is probably just to provide a substitution table for tags and classes, and not to worry about pulling the style attributes out of the HTML at all. It would, however, be useful to have some kind of report showing which tags are present in the HTML to be imported.
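Such a report could be produced outside InDesign before the import. The sketch below uses Python's standard html.parser module to count every tag and class combination in a page; the TagReport class and report_tags function are hypothetical names, and an element with several classes is counted once per class.

```python
from collections import Counter
from html.parser import HTMLParser


class TagReport(HTMLParser):
    """Count (tag, class) combinations; the class is None when no class is set."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        # An element with several classes is counted once per class.
        for name in classes.split() or [None]:
            self.counts[(tag, name)] += 1


def report_tags(html_text):
    """Return a Counter keyed by (tag, class) for the given HTML text."""
    parser = TagReport()
    parser.feed(html_text)
    parser.close()
    return parser.counts
```

The keys of the resulting counter are the combinations a substitution table would need to cover.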

An Existing Solution

Without developing a plug-in, there is currently a way to at least get the text in from an HTML page: the Import XML feature of InDesign. For this to work, however, the <head> section needs to be removed from the HTML file.
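Removing the <head> section can be scripted before the import. The following is a minimal sketch using a regular expression; strip_head is a hypothetical helper, and it only performs the removal described here. The remaining markup still has to be well-formed XML for the import to accept it, which this sketch does not check.

```python
import re

# Matches the first <head ...>...</head> block; \b keeps <header> from matching.
HEAD_PATTERN = re.compile(r"<head\b.*?</head\s*>", re.IGNORECASE | re.DOTALL)


def strip_head(html_text):
    """Return the HTML text with its <head> section removed."""
    return HEAD_PATTERN.sub("", html_text, count=1)
```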