Word-wrap not working in Visualforce PDF

Not too long ago I was asked to support with some PDF rendering issues in one of our projects. In particular we were facing two common problems - word-wrap not working in Visualforce PDFs, and the loss of styling when using “Arial Unicode MS” font family.

I plan to write two blog posts about these problems with the solutions we implemented to get around them. This first post will be about the issue with word-wrap: break-word not playing nice.

Before and After photos

In order to demonstrate the problem and the solution, I’ve created a 4 column HTML table with some very long words inside of it. Here is how the table looks like when it is rendered in a Visualforce page as a PDF:

And here is how the same table looks like inside the same Visualforce page after some Apex magic:

Why is this happening?

The main problem we are facing is that Salesforce Visulaforce PDFs are using a very old PDF rendering engine called Flying Saucer which has a very limited support for CSS 3 features and the word-wrap is not one of them.

I learned this by reading through questions and answers in this StackExchange post when I first encountered this issue. On that same StackExchange post I’ve read the answer from Ralph Callaway that I used as a starting point for the solution I will present here.

How to solve it?

The basic concept of the solution is that the long words must be broken down so that they are automatically wrapped by the PDF rendering engine. The best way to split them is to wrap them with  tags so that a word such as SomethingVeryLong is represented as SomethingVeryLong.

The Visualforce page rendering engine can then use the span elements to split the word if it needs to, and the best part is, if it does not have to split it, the PDF will render it normally.

The Code

The below code snippet contains the complete solution for breaking the long words into individual span elements:

/**
 * Maximum word lenght
 */
final static Integer LENGTH = 7;

/**
 * Reduce the long strings (i.e. words) inside the provided text into more manageable pieces.
 * NOTE: the total length of the final string might get significantly larger due to this.
 *
 * @param text  String to reduce
 *
 * @return      Same string as the one passed in, but each individual long string is
 *              split and surrounded with `<span></span>` tags.
 *              e.g.ABC-Something -> <span class="avoid-wrap"><span>ABC-</span><span>Somethi</span><span>ng</span></span>
 */
public static String stringReducer(String text) {
    if (String.isBlank(text)) {
        return '';
    }

    String retVal = text;

    for (String str : new Set<String>(stripHtml(text).unescapeHtml4().split(' '))) {
        if (str.length() > LENGTH) {
            String newString = splitWord(str, LENGTH);

            // split into 4 replace calls becuse we had an issue that the html properites were getting replaced by accident:
            // e.g. <a href='somelongurl'>somelongurl</a> <- only "somelongurl" in content should be replaced.
            retVal = retVal.replace(' ' + str + ' ', ' <span class="avoid-wrap">' + newString + '</span> ')
                .replace('>' + str + '<', '><span class="avoid-wrap">' + newString + '</span><')
                .replace('>' + str + ' ', '><span class="avoid-wrap">' + newString + '</span> ')
                .replace(' ' + str + '<', ' <span class="avoid-wrap">' + newString + '</span><');//"
        }
    }

    return retVal;
}

/**
 * Strip HTML tags from the provided string
 */
private static String stripHtml(String s) {
    if (String.isBlank(s)) {
        return '';
    }

    return s.replaceAll('<[^>]+>', ' ');
}

/**
 * Split the provided string at special characters, 
 * then at any position if the length still exceeds `maxLength`
 *
 * @param str       String to split
 * @param maxLength Max lenght of each individual part of the string
 *
 * @return          ABC-Something -> <span>ABC-</span><span>Somethi</span><span>ng</span>
 */
private static String splitWord(String str, final Integer maxLength) {
    String newString = '';

    for (String s2 : splitStringBySpecialCharacters(str, '-;:<>/\\\u00A0')) {
        if (s2.length() > maxLength) {
            // the string is still too long, so split it further without any character consideration
            for (String s3 : splitEqually(s2, maxLength)) {
                newString = wrapAndAppend(newString, s3);
            }
        } else {
            // the string is of acceptable length, so just wrap it in <span></span> and append it
            newString = wrapAndAppend(newString, s2);
        }
    }

    return newString;
}

/**
 * Split the string by special characters.
 * If the special character is not a whitespace, it will be appended to the end of the first part of the string
 * (e.g. "ABC-Something" -> "ABC-", "Something")
 *
 * @param str               string to split
 * @param splitCharacters   single string of characters by which to split the string
 *
 * @return                  splitted string
 */
private static List<String> splitStringBySpecialCharacters(String str, String splitCharacters) {
    List<String> parts = new List<String>();

    if (str != null) {
        Integer startIndex = 0;
        while (true) {
            Integer preIndex = str.subString(startIndex).indexOfAny(splitCharacters);
            Integer index = startIndex + preIndex;

            if (preIndex == -1) {
                parts.add(str.subString(startIndex));
                break;
            }

            String word = str.subString(startIndex, index);
            String nextChar = str.subString(index, index + 1);

            // Dashes and the likes should stick to the word occuring before it. Whitespace doesn't have to.
            if (nextChar.isWhiteSpace() || nextChar == '\u00A0') {
                parts.add(word);
                parts.add(nextChar);
            } else {
                parts.add(word + nextChar);
            }

            startIndex = index + 1;
        }
    }

    return parts;
}

/**
 * Split the string equally into list of strings of provided size.
 * The trailing (last) string will usually be smaller than the rest.
 *
 * @param str   string to split
 * @param size  maximum size of each part
 *
 * @return      splitted string
 */
private static List<String> splitEqually(String str, Integer size) {
    List<String> ret = new List<String>();

    for (Integer start = 0; start < str.length(); start += size) {
        ret.add(str.substring(start, Math.min(str.length(), start + size)));
    }
    return ret;
}

/**
 * Wrap `toAppend` string into `<span></span>` so that it can be wrapped automatically by PDF engine
 * and append it to the end of the provided `str`.
 *
 * @param str       String to append to and then return
 * @param toAppend  String to wrap and append
 *
 * @return          str<span>toAppend</span>
 */
private static String wrapAndAppend(String str, String toAppend) {
    if (String.isBlank(toAppend)) {
        return str;
    }

    // don't wrap &nbsp; in spans
    // as it will just increase the string size without any benefits
    // (&nbsp; will wrap on its own)
    if (toAppend == '\u00A0') {
        str += toAppend;
    } else {
        str += '<span>' + toAppend + '</span>';
    }

    return str;
}

The code will apply the avoid-wrap CSS class to the span elements that represent the complete word that was split. Because of that, it’s always good to include the following CSS in your Visualforce Page to tell it not to wrap that span unless it really needs to:

/* try to avoid wrapping words if possible, and instead wrap at whitespaces */ 
span.avoid-wrap {
  display:inline-block;
}
a>span.avoid-wrap {
  display:inline;
}

Additionally, the code will try to be “smart” when splitting the words and try to split them at special characters (e.g. -;:<>/\\\u00A0).

How to use the code?

Let’s say you have an Apex controller with a getter called getHtml() that returns some HTML back to the Visualforce page. All you need to do is pass that HTML code through the stringReducer() function before returning it:

public with sharing class PDFController {
    public string getHtml() {
        return stringReducer('pass your HTML here');
    }
}

For the sake of completenes, here is the Visualforce Page code that I used to render the PDFs from the images above:

<apex:page 
  controller="PDFController" 
  showHeader="false" 
  renderAs="pdf" 
  applyHtmlTag="false" 
  applyBodyTag="false">
    <html>

    <head>
        <style>
            @page {
                size: A4;
                margin-top: 19mm;
                margin-left: 16mm;
                margin-right: 16mm;
                margin-bottom: 18mm;
            }

            /* border for table */

            td,
            th {
                border: 1px solid #444;
                padding: .5em;
                word-wrap: break-word;
            }

            table[border="0"] td,
            table[border="0"] th {
                border: none;
                word-wrap: break-word;
            }

            tr {
                page-break-inside: avoid;
            }

            /* try to avoid wrapping words if possible, and instead wrap at whitespaces */

            span.avoid-wrap {
                display: inline-block;
            }

            a>span.avoid-wrap {
                display: inline;
            }
        </style>
    </head>

    <body>
        <apex:outputText value="{!html}" escape="false"></apex:outputText>
    </body>

    </html>
</apex:page>

And the sample HTML table:

<h2>HTML Table</h2>

<table>
	<tbody>
		<tr>
			<th colspan="1" rowspan="1">Column 1</th>
			<th colspan="1" rowspan="1">Column 2</th>
			<th colspan="1" rowspan="1">Column 3</th>
			<th colspan="1" rowspan="1">Column 4</th>
		</tr>
		<tr>
			<td colspan="1" rowspan="1">Some normal text</td>
			<td colspan="1" rowspan="1">Some normal text</td>
			<td colspan="1" rowspan="1">Some normal text</td>
			<td colspan="1" rowspan="1">Some normal text</td>
		</tr>
		<tr>
			<td colspan="1" rowspan="1">Someverylongtextiswrittenherewhatwillithappenwithit</td>
			<td colspan="1" rowspan="1">Someverylongtext iswrittenhere whatwillithappenwithit</td>
			<td colspan="1" rowspan="1">Someverylongtext-iswrittenhere-whatwillithappenwithit</td>
			<td colspan="1" rowspan="1">Some-long-text-with-dashes</td>
		</tr>
	</tbody>
</table>

Final remarks

The solution above requires you to pass the generated HTML back to the VF Page and render it unescaped. You should always be careful when doing this as it can result in XSS as reported by the PMD Code Analyzer.

Written on June 9, 2022