Hi danraftis, I'm interested in a generic solution to automate the assembly of PDF files (instead of manual cut and paste). Contact me at ycco@fgrazi.33mail.com if you are interested in working together as a pilot user.
Our company pumps out tons of long-document PDFs that are fairly complex in terms of text, charts, tables and whatnot. We then take the docs, Excel files, and charts into InDesign to create a nicely designed PDF to upload online.
I need to find a way to automate this workflow. Has anyone out there experienced this in their organization and want to work together on something?
Never heard of that ;) Indeed, manual processing is an option if you are OK with adding some code any time a user wants a new PDF design. That's not really a scalable process if your number of PDFs is increasing. It's also not a viable option for less technical users.
So, I've been working on pdfequips.com for about a year now, and I built this tool based on a request from someone I met on Reddit.
The idea of the project is to provide free PDF tools for merging, splitting, compressing, converting, rotating, etc., and then implement advanced features through a premium subscription.
I work in an industry that marks up PDFs extensively (construction) and have wanted to build a bespoke PDF markup tool for a long time. In designing concrete floor slabs, we need to transfer information from our FEA software to a dwg. Currently a blank plan is printed, it’s marked up by hand, and then handed to a drafter to put back into the computer. It’s a very inefficient process. But all the SDKs are stupidly expensive (this one included). It’s really surprising there isn’t something open source, given the importance of PDF markup in many industries and businesses.
I have automation around the downloading of pdfs that interest me and the production of man pages to pdfs. Gradually I'm developing conventions around format and so on, but it's pretty idiosyncratic, as you can imagine.
I started building this back in December just for fun and suddenly it turned into a product. It provides a simple way to work with PDF files via an API as well as a web interface. I know there are some others on the market, but I hope someone finds it useful.
Here's a neat hack I made recently to do basic PDF editing directly in a browser—without having to upload anything to a server.
I was initially looking for a way to do simple PDF modification (extracting pages, merging, and adding page numbers). There are some good server-side tools for this (QPDF, PDFTk, PDFBox, iText, Hummus), but for better speed and privacy I really wanted a 100% client-side solution.
There are a few good JavaScript PDF libraries for reading and displaying PDFs (pdf.js) and creating PDFs from scratch (jsPDF, PDFKit), but I couldn't find any for editing existing PDFs. So, I did what any self-respecting hacker would do, and rolled my own. :-)
Actually, I found out that Mozilla's pdf.js solved half the problem, as it does an excellent job disassembling PDF files. So all I had to do was figure out a way to put them back together again.
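For anyone wondering what "putting them back together" involves, here is a minimal sketch in Python. It is not PDF Assembler's actual code, just an illustration under simple assumptions: every object is already serialized as bytes, all generation numbers are 0, and the first object is the catalog. The function name `build_pdf` and the blank-page example are made up for this sketch. The core of the job is writing each object, remembering its byte offset, and then emitting a cross-reference table and trailer that point back at those offsets.

```python
def build_pdf(objects):
    """Serialize a list of PDF object bodies (bytes) into one PDF file.

    objects[i] becomes object number i + 1; objects[0] must be the catalog.
    """
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: one fixed-width 20-byte entry per object.
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is the head of the free list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    # Trailer names the catalog and tells readers where the xref table is.
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


# Hypothetical example: a single blank US Letter page.
blank_page = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << >> >>",
]
pdf_bytes = build_pdf(blank_page)
```

Writing `pdf_bytes` to a file in binary mode should give a one-page blank document. A real editor also has to deal with streams, compression, and renumbering objects when pages are added or removed, which this sketch ignores.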
The result is PDF Assembler, now available on GitHub and NPM. I also put together a demonstration site (https://www.pdfcircus.com) which shows some examples of what it can do. I know PDF Assembler still needs some tweaking, but I think the basic idea is sound, and so far I've been pretty happy with how it works.
Please take a look and let me know what you think.
I have a bunch of scanned PDFs from an open data request I'm looking forward to trying when I'm home. My own solution with pytesser was pretty effective but required a ton of tweaking.
In the past I've used Ghostscript to merge PDFs, but it's not user friendly, of course. I do like your idea of one site hosting these sorts of FOSS utilities with nice wrappers.
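For reference, the Ghostscript merge can be hidden behind a small wrapper. This is a rough Python sketch that assumes the `gs` executable is on the PATH; the function names and the file names in the example are made up for illustration. The flags are the usual ones for the `pdfwrite` device.

```python
import subprocess


def ghostscript_merge_command(inputs, output):
    """Build the Ghostscript command line that concatenates PDFs in order."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        "gs",
        "-q",                      # suppress startup messages
        "-dBATCH",                 # exit after the last file
        "-dNOPAUSE",               # do not prompt between pages
        "-sDEVICE=pdfwrite",       # write PDF output
        f"-sOutputFile={output}",
        *inputs,
    ]


def merge_pdfs(inputs, output):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.run(ghostscript_merge_command(inputs, output), check=True)


# Hypothetical file names, only to show the resulting command.
command = ghostscript_merge_command(["part1.pdf", "part2.pdf"], "merged.pdf")
```

Calling `merge_pdfs(["part1.pdf", "part2.pdf"], "merged.pdf")` would then run the command. Note that Ghostscript re-renders the inputs through `pdfwrite` rather than copying pages as-is, so things like bookmarks or form fields may not survive.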