Week 2 of my pairing tour started off with a day spent back on AXUS with Kristin, a software crafter. She used the day as an opportunity to tackle an issue that’s been going on for a while and hasn’t been fully solved despite a few attempts by others on the team: formatting PDFs.

In the travel industry, PDFs remain a popular way to share information, and it’s important that they look good. The team wanted to make sure that images weren’t split across pages, and that each image’s header stayed with its image, so a header could never end up at the very bottom of one page with its image at the top of the next.

Naturally, this task was far from simple. We spent most of the morning trying to make it work with the page-break-before and page-break-after properties, setting them to both always and avoid in different places, but for some reason they just didn’t seem to take. Progress was slow, since every time we made a change we had to refresh the browser, generate the PDF, and scroll to the bottom, where we had an example that reproduced the problem, to see what it looked like. While we were looking at the code, we found a whole lot of page-break-before and page-break-after declarations on different classes, sometimes in places where they didn’t matter and sometimes combined in such a way that they cancelled each other out. We cleaned things up as best we could, and I was reminded how important it is to make helper/utility classes in my CSS for properties like this that I can reuse again and again, rather than sticking the same properties onto a ton of different classes in a bunch of different files. Scattering them that way makes changing things later much more difficult than necessary.
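To illustrate the helper-class idea, here is a minimal sketch of what such utilities might look like. The class names are hypothetical and not taken from the AXUS codebase; each rule pairs the older page-break-* property with its newer break-* equivalent, on the assumption that different renderers may recognize one or the other.

```css
/* Hypothetical utility classes; the names are illustrative only. */

/* Force a new page before the element. */
.u-break-before {
  page-break-before: always;
  break-before: page;
}

/* Ask the renderer not to break right after the element
   (for example, a header that should stay with what follows). */
.u-keep-with-next {
  page-break-after: avoid;
  break-after: avoid;
}

/* Ask the renderer not to split the element across pages. */
.u-avoid-break-inside {
  page-break-inside: avoid;
  break-inside: avoid;
}
```

With the break behavior defined once, changing it later means editing one rule instead of hunting through many files.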

I started to look more into the properties as well as how they’re typically handled. When you’re looking at a page, even something really long, on your computer, you don’t see any page breaks. Those only come when you either print off what’s on your screen or do something else, like export it to a PDF. I read about a media query that applies styles just when the web page is being printed (@media print), and we tried applying it to our CSS classes. This didn’t fix the problem, but I wanted to see what would happen if we printed out our PDF. Sure enough, on our printed version, the images listened to our code and didn’t break across pages. So we had solved for actual printing, but the team knew that these PDFs are often created and then emailed, and possibly never printed off. We still had to figure it out for PDFs.
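For reference, a print-only rule along the lines of what we tried might look like the sketch below. The selectors are assumptions for illustration, since the real markup isn't shown here, and whether a PDF generator honors print styles at all depends on the library.

```css
/* Applies only when the page is printed (or rendered with print media). */
@media print {
  /* Assumed selector: keep each image on a single page. */
  img {
    page-break-inside: avoid;
    break-inside: avoid;
  }

  /* Assumed selector: try to keep a header with the content after it. */
  h2 {
    page-break-after: avoid;
    break-after: avoid;
  }
}
```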

After lunch, I realized that we had to be missing something, and asked Kristin to explain more about how the PDFs are generated. She helped us trace the process back to a PDF generator library, and I spent a while reading its docs for anything that would help. Eventually I found it - a line that explained why we couldn’t get the header to stick to its image. It said that if a page-break style was applied to something too long, like a div that spanned three pages, it simply wouldn’t do anything. This was exactly our problem. Because of the way the content of each PDF was dynamically generated, it wasn’t a simple matter of wrapping what we wanted in a div and giving the wrapper our styles. We had to wrap more than we wanted, and because the wrapper was too long, the PDF generator simply ignored our styles.
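As a sketch of the wrapper approach we were hoping for, assuming the markup could isolate each header together with its image (which ours could not), the rule would be something like this. The class name and the markup in the comment are hypothetical.

```css
/* Hypothetical markup this rule assumes:
     <div class="image-block">
       <h2>Header</h2>
       <img src="..." alt="...">
     </div>
*/

/* Keep the header and its image together on one page. This can only
   work when the wrapped block is shorter than a page; per the docs we
   read, the generator ignores the style on anything longer. */
.image-block {
  page-break-inside: avoid;
  break-inside: avoid;
}
```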

We didn’t have a fix, but we did know a lot more than when we started. We updated our related stories with this new information and went about trying to find a work-around. We created a new rule that would put a page break before every header, starting every new section on its own page. This solution isn’t perfect, since it created some odd white space whenever a section’s content didn’t completely fill its last page, but it did result in headers sticking with their images and images always being at the top of the page, meaning they never got split across two pages. Unfortunately, it’s a situation where we might have to go with a solution that doesn’t make everyone happy - we learned that this used to be the way the PDFs were styled, but people using them didn’t like the extra white space. So the team got rid of the page breaks, only to start getting tickets about the images being split. Kristin, who understands the code that dynamically generates the content really well, knows that if she had a lot of time she could redo it to make it possible to isolate the parts we want and put wrapper classes around them, but then it becomes a matter of deciding if it’s worth it to the client to pay that much to fix something that might not be a huge problem - after all, it’s a styling issue, nothing to do with functionality. I really wish we had been able to come up with an elegant solution by the end of the day, but it looks like Kristin and the rest of the team will have to discuss it with the client to figure out if it’s worth investing more time to fix.
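The work-around boils down to a rule roughly like the one below. This is a sketch rather than our exact code, and the class name is an assumption.

```css
/* Hypothetical class name for the section headers in the PDF. */
.section-header {
  /* Start every section on a fresh page, so the header and its image
     always begin at the top. The trade-off is leftover white space at
     the end of the previous page. */
  page-break-before: always;
  break-before: page;
}
```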

This seems to be a somewhat common situation with client work - you know you can make something work eventually - you might even really want to figure out the solution, like Kristin and I still do - but it may not be in the client’s best interest to pay you to do it.

Before today, I had never thought about how PDFs are created and styled. There’s clearly a lot that goes into it, and I had trouble finding resources on the subject, possibly because both PDFs and printed pages are becoming a thing of the past. Nevertheless, I’m glad I have a better handle on it now. In the future, if I ever get to help my team choose a tool like a PDF generator, I’m going to investigate what support it has for styling. I’m continuing to learn that crafters are always learning about new things that they probably never thought they’d have to study. It’s not about knowing everything but rather about honing your skills at research and learning new things quickly.

Main takeaways of the day:

  • Make sure you spend time researching before diving into a problem. If I had started off by researching PDF generation more thoroughly than I did, I could have asked Kristin about how they are generated earlier on and not wasted time wondering why our page-break styles refused to work.
  • There’s not an elegant solution for every problem. Sometimes you have to find the best solution for the given constraints and accept that you’ve left the project better than you found it.