
Hi danraftis, I'm interested in a generic solution to automate the assembly of PDF files (instead of manual cut and paste). Contact me at ycco@fgrazi.33mail.com if you are interested in working together as a pilot user.



Our company pumps out tons of long-document PDFs that are fairly complex in terms of text, charts, tables and whatnot. We then take the docs, excel files, and charts into InDesign to create a nicely designed PDF to upload online.

I need to find a way to automate this workflow. Has anyone out there experienced this in their organization and want to work together on something?


I've actually never heard of that! I'll look into it right now, thanks!

I was looking for PDF solutions months ago, but if this works well, it could come in handy in the future.


never heard of that ;) Indeed, manual processing is an option if you are OK with adding some code any time a user wants a new PDF design. That's not really a scalable process if your number of PDFs is increasing. It's also not a viable option for less technical users.

Can you recommend any command-line pdf tools? As a student I have to look at lots of pdfs and the gui tools for manipulation are very lacking.

so, I've been working on pdfequips.com for about a year now, and I built this tool based on a request from someone I met on reddit.

The idea of the project is to provide free PDF tools for merging, splitting, compressing, converting, rotating, etc., and then to implement advanced features through a premium subscription.


Use DocAssemble... it has an API and can fill PDF templates.

Adobe Acrobat Pro, Foxit Phantom PDF. I'll concede I can't think of a file manager though.

I mostly use the NAPS2 scanner app on Windows. It can scan, optimize, split, and join PDFs.

It’s not entirely clear what kind of PDFs you are building, but I’d like to hear more.

I’ve done this for 13 years (Mostly Quark instead of InDesign, but the same tools and tricks).

Let me know if you want to work together.

rusty

[you can email, feepish @ google’s free email]


I wrote an automation that converts any PDF to a Notion doc, preserving headings, formatting, equations (inline and block), images, and references.

Hop on the waitlist! Will be deploying first to a small group of beta users from the waitlist and then full-self-serve in 2-4 weeks.


The edit PDF tool is under development.

I work in an industry that marks up PDFs extensively (construction) and have wanted to build a bespoke PDF markup tool for a long time. In designing concrete floor slabs, we need to transfer information from our FEA software to a dwg. Currently a blank plan is printed, it’s marked up by hand, and then handed to a drafter to put back into the computer. It’s a very inefficient process. But all the SDKs are stupidly expensive (this one included). It’s really surprising there isn’t something open source, given the importance of PDF markup in many industries and businesses.

I have automation around downloading the PDFs that interest me and converting man pages to PDFs. Gradually I'm developing conventions around format and so on, but it's pretty idiosyncratic, as you can imagine.

I started building this back in December just for fun, and suddenly it turned into a product. It provides a simple way to work with PDF files via an API as well as a web interface. I know there are some others on the market, but I hope someone finds it useful.

Yes, there is enormous interest in this kind of thing, not least in larger organizations with tons of PDF documents in various forms.

Even though this would only cover a small part of the needs or use cases, it will still be hugely useful if it works well.


Here's a neat hack I made recently to do basic PDF editing directly in a browser—without having to upload anything to a server.

I was initially looking for a way to do simple PDF modification (extracting pages, merging, and adding page numbers). There are some good server-side tools for this (QPDF, PDFTk, PDFBox, iText, Hummus), but for better speed and privacy I really wanted a 100% client-side solution.
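For comparison, here is roughly what the server-side route looks like with QPDF, one of the tools named above. This is only a sketch: it assumes the `qpdf` binary is installed and on the PATH, and the helper names `build_qpdf_command` and `run_qpdf` as well as the file names are made up for illustration.

```python
import shutil
import subprocess
from typing import List, Sequence, Tuple


def build_qpdf_command(parts: Sequence[Tuple[str, str]], output: str) -> List[str]:
    """Build a qpdf argument list that assembles pages from several PDFs.

    Each entry in `parts` is (path, page_range). A range such as "1-3"
    selects those pages; an empty string selects every page ("1-z").
    """
    if not parts:
        raise ValueError("at least one input file is required")
    # Start from an empty document and append the selected pages in order.
    cmd = ["qpdf", "--empty", "--pages"]
    for path, page_range in parts:
        cmd.append(path)
        cmd.append(page_range or "1-z")  # "z" is qpdf's name for the last page
    cmd += ["--", output]
    return cmd


def run_qpdf(parts: Sequence[Tuple[str, str]], output: str) -> None:
    """Run qpdf with the command built above (hypothetical helper)."""
    if shutil.which("qpdf") is None:
        raise RuntimeError("qpdf was not found on PATH")
    # check=True raises on any non-zero exit code, including qpdf's
    # "finished with warnings" status.
    subprocess.run(build_qpdf_command(parts, output), check=True)
```

Calling `run_qpdf([("report.pdf", "1-3"), ("appendix.pdf", "")], "out.pdf")` would take the first three pages of one file and all pages of another. Every file has to be reachable by the machine running the command, which is exactly the upload step a client-side tool avoids.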

There are a few good JavaScript PDF libraries for reading and displaying PDFs (pdf.js) and creating PDFs from scratch (jsPDF, PDFKit), but I couldn't find any for editing existing PDFs. So, I did what any self-respecting hacker would do, and rolled my own. :-)

Actually, I found out that Mozilla's pdf.js solved half the problem, as it does an excellent job disassembling PDF files. So all I had to do was figure out a way to put them back together again.

The result is PDF Assembler, now available on GitHub and NPM. I also put together a demonstration site (https://www.pdfcircus.com) which shows some examples of what it can do. I know PDF Assembler still needs some tweaking, but I think the basic idea is sound, and so far I've been pretty happy with how it works.

Please take a look and let me know what you think.

Thanks!

PDF Assembler Links:

Demonstration Site - https://www.pdfcircus.com

GitHub - https://github.com/DevelopingMagic/pdfassembler

NPM - https://www.npmjs.com/package/pdfassembler


I have a bunch of scanned PDFs from an open data request I'm looking forward to trying when I'm home. My own solution with pytesser was pretty effective but required a ton of tweaking.

There is already such software; it's called PDF.

In the past I've used Ghostscript to merge PDFs, but it's not user friendly, of course. I do like your idea of one site hosting these sorts of FOSS utilities with nice wrappers.
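For anyone who wants the Ghostscript merge without memorizing the flags, a thin wrapper along these lines might be enough. It is a sketch, not a polished tool: it assumes the executable is called `gs` (on Windows it is usually `gswin64c`), and the function names `build_gs_merge_command` and `merge_pdfs` are made up here.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_gs_merge_command(inputs: Sequence[str], output: str,
                           executable: str = "gs") -> List[str]:
    """Build the Ghostscript argument list that concatenates PDFs in order."""
    if not inputs:
        raise ValueError("at least one input file is required")
    return [
        executable,
        "-dBATCH",            # exit after processing instead of prompting
        "-dNOPAUSE",          # do not pause between pages
        "-q",                 # suppress the startup banner
        "-sDEVICE=pdfwrite",  # write the result as a PDF
        f"-sOutputFile={output}",
        *inputs,
    ]


def merge_pdfs(inputs: Sequence[str], output: str, executable: str = "gs") -> None:
    """Run Ghostscript to merge `inputs` into `output` (hypothetical helper)."""
    if shutil.which(executable) is None:
        raise RuntimeError(f"{executable} was not found on PATH")
    subprocess.run(build_gs_merge_command(inputs, output, executable), check=True)
```

Something like `merge_pdfs(["a.pdf", "b.pdf"], "merged.pdf")` would then do the job. Keep in mind that `pdfwrite` re-generates the document rather than copying it byte for byte, so some features of the originals may not survive the merge.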