
> a quick emacs regexping, a bit of macrology, pasted it into a spreadsheet (libreoffice recognises it's tab or whatever separated) and bingo, you're off.

It's very likely that, if you knew AWK well, you could have completed this task faster.

> Believe me, I have a track record of delivering, but all my job agents do is match by skills. And that's what their clients want.

It's hard to find the few companies that actually prioritize results, but it's worth it.




> using AWK will almost always be faster than designing a schema and importing into a database first

I have never had to do that with unstructured data - ah wait, I lie! I pulled data off the covid wiki page, a quick emacs regexping, a bit of macrology, pasted it into a spreadsheet (libreoffice recognises it's tab or whatever separated) and bingo, you're off.

> Oh, and want to know a really good tool for getting data from a text file into a database? AWK!

For smallish amounts, I use emacs regexps. For data too large for emacs, I'd probably use python.

> and without fuss that can create value for your company, and those are things that can go on your resume.

That's not how it works any more. They look for skills. They should not, they should do as you say, but it seems they don't any more.

> specific things you have accomplished becomes more important than the keywords on your resume

Again, no. Believe me, I have a track record of delivering, but all my job agents do is match by skills. And that's what their clients want. It's depressing.

So I disagree, but thanks anyway.


> If you are competent, have a realistic estimate of your competence, this might not be a bad idea if what you need is a purely technical solution.

This.

Just _yesterday_ some xls-to-json library that has been used in the past was choking on an innocent-looking file. After fiddling with the thing for over an hour, including using LibreOffice to change the file format, I still could not get the library to read the file.

In five minutes I had a Python script which read the CSV version of the file and converted it to the necessary format. I chanced writing the script because I knew that I could. In other situations, gambling company time on "I _might_ be able to help" usually leads to wasting company time.
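For anyone curious what such a five-minute script tends to look like, here is a minimal sketch. It assumes the CSV has a header row and that the target format is a JSON array of objects; the function name `csv_to_json` and the sample columns are made up for illustration, not taken from the actual file.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Turn CSV text with a header row into a JSON array of objects.

    Every value stays a string; type conversion is left out on purpose.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)


if __name__ == "__main__":
    # Hypothetical sample data, standing in for the exported spreadsheet.
    sample = "name,qty\nwidget,3\ngadget,5\n"
    print(csv_to_json(sample))
```

Using the standard `csv` module rather than splitting on commas means quoted fields with embedded commas are handled without extra work.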


>I've learned plenty of programming tools and I recoil in disgust at the sight of them

As do I.

>Spreadsheets are pretty much the best tool out there for exploratory work with numbers.

That's an ideal use of a spreadsheet. A non-ideal use would be, say, re-inventing the RDBMS in Excel (perhaps the most common one, but there are far worse sins).

>Personally, I'm in a process of learning Excel in depth

I would, but I don't have the time. As I've said: with the sort of data analysis I'm doing (very little), AWK works well enough.


> I'm trying out QuikTape now, but it had to clone 1.6GiB.

How far we’ve come from awk scripts that are about 1 KiB (https://c2.com/doc/expense/)

> and good date math (`today + 3 business days`)

That’s tricky/impossible in international teams.
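To make the difficulty concrete, here is a rough sketch of "today + 3 business days". The function `add_business_days` and its parameters are hypothetical; the point is that the weekend days and the holiday set have to be supplied by the caller, and those differ per country, so a single answer for a whole international team does not exist.

```python
from datetime import date, timedelta


def add_business_days(start, days, holidays=frozenset(), weekend=(5, 6)):
    """Return the date `days` business days after `start`.

    `weekend` holds weekday numbers (Monday is 0), so (5, 6) means
    Saturday and Sunday. `holidays` is a set of dates to skip.
    Both are assumptions the caller must supply per locale.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() not in weekend and current not in holidays:
            remaining -= 1
    return current


if __name__ == "__main__":
    friday = date(2024, 3, 1)
    # Saturday/Sunday weekend versus a Friday/Saturday weekend.
    print(add_business_days(friday, 3))
    print(add_business_days(friday, 3, weekend=(4, 5)))
```

The same start date gives different results depending on which calendar you plug in, which is exactly the problem.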


> Before writing the resume I set out to find the ideal tool to write it.

Um, any word processor? Even a text file? A good resume is only a page long. Spending time researching the ideal tool, buying it, installing it, worrying that it is malware, learning how to use it, seems like far more time than just typing it in.

Heck, my early resumes were written on a manual typewriter. Then they became text files sent to a daisy wheel printer, which made them look very nice.


> The best one was an employer's site that described itself as "Easy Apply"! You had to give it a resume, which it parsed, badly, and sprayed randomly into about a thousand text boxes. I thought maybe the problem was starting with a pdf, so I began again with a Word document. The results were exactly the same, suggesting they exported to pdf and used the same shitty parser.

Ohh, let me guess. Workday? There are a few application systems that offer this functionality but Workday is _consistently_ the worst at parsing whatever I give it (text, markdown, html, pdf...).


> but that's easy to solve

That's handwaving over a lot more than just grepping a word file.


>I used text-to-columns to very nicely parse pipe-delimited files.

Sure, but you still can't edit it and save it without losing data. I use Excel nearly every day as well, and I get more and more frustrated the more I use it. My coworkers hate it as well. Just because you haven't personally experienced any problems doesn't mean they don't exist.


> when a prospective job applicant sends a resume in Pages, it’s a disqualifier

I'm utterly astounded that anyone sends a resume in anything other than PDF.

Sometimes employers require .doc or .docx, and that's a disqualifier. For me.


> Even though I have a Mac, when a prospective job applicant sends a resume in Pages, it’s a disqualifier, because they don’t understand that Windows users can’t read that.

As someone who can remember the 1990s, when .DOC was the standard file format for any kind of business and much academic work, I find this terribly amusing.


> I honestly wonder if there is actually going to be any cost savings, once training and the friction of converting existing work-flows over from MS Office to LibreOffice are taken into account.

Anecdotally, a frighteningly large part of the civil service's workflow is implemented in Excel/Word macros. If that's true, it won't matter how good LibreOffice is; migrating that to anything else is going to be the biggest pain (and cost) point.


> Making an interface for a few columns of a few hundred rows sounds like the classic NCIS hacker line "let me quickly code a GUI in Visual Basic to run that IP address check".

I can understand that, but keep in mind: I add a new line at the end of the spreadsheet for every new vocab word that is highlighted by the book.

I have to look it over to make sure I got everything, so the words need to be sorted. That meant selecting everything in the menu (two clicks), then clicking sort (another two clicks). For every few words...

It was much easier just to make a button into the spreadsheet that does this for me. One click.

It was also a perfect excuse to learn about macros. If you ask 'what's the point?' for every fun little technology you learn about, you will become depressed fast.


> It is very uncommon to automate MS-Word or Excel via COM these days. I am happy.

Yeah, but I have a few bazillion lines of .bas that I'm not afraid to dip into on occasion for purely internal reasons.


> Markdown will never get beyond developers.

My brother wrote his master's thesis in LaTeX. Since he's a non-techie (he's an archeologist), I had to do all the 'fixing'.

We switched to LaTeX after both Word and OpenOffice Writer made working with the document a non-option.

I also used git to manage changes.

This was 2009.

He could not have done it without my help with this setup. But he also could not have done it without me in Word/Writer because these apps just behave 'funny' once you reach a certain complexity in your document.

Fast forward to today.

My partner is writing her PhD thesis (she's an anthropologist and hates anything that has a keyboard).

Not in LaTeX but in Markdown.

It lives in a GitHub repo. She is using VSCode with some extension that translates Markdown to LaTeX and then to PDF so she can have a preview (inside VSCode no less) any time.

It took me half a day to research the options, settle on that one and set it up on her laptop (she still could never have done that herself).

I then spent about two hours teaching her the workflows.

Thanks to VSCode's built-in git support she manages that also from within the app. Including branches and merges.

After the initial two hours she never needed any help from me.

I think there is a business opportunity here.

People in the humanities in particular seem to dread writing stuff, for one because modern word processors are just monstrous and unstable as f*ck once you reach a certain document size/complexity.


> ...it doesn't break the flow because presentation comes later and logic is at the beginning of the dt part

I didn't mean to discredit the work done. It's a big undertaking in any case. The idea and the aim are good; however, it breaks the conciseness and reduces the speed of implementation.

> with the aforementioned tools you have the logic and output mixed all over the place.

I think this is a secondary effect of the composability, piping and conciseness requirements.

> I will however still use AWK.

Me too, and this is why I made my prior comment, exactly.


> I rarely do it, but it’s nice to be able to Human read when I need to. Also being able to use all the command line text tools is super convenient.

Sometimes it helps a lot to eyeball what you have before doing one-off scripts to filter/massage your data. Had a recent case where the path of least resistance was database to CSV to one-off Python to statistics tools with a GUI and tabular display.

Could have probably done some enterprise-looking export infrastructure but it was a one-off and not worth it.


> Never found a good solution to powerpoint

tpp or LibreOffice's Impress.

Open formats diff better. There's also the homebrew method I use, where I decompress the document and store that tree in git too. Sometimes it helps, and it is easy to recompress into document format. It's part of my regular checkin script.
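A rough sketch of that decompress-and-store idea, for anyone who wants to try it. This is not my actual checkin script; the function names and paths are made up. It relies on the fact that ODF and OOXML documents are zip archives, and that ODF expects the `mimetype` entry to come first and be stored uncompressed.

```python
import zipfile
from pathlib import Path


def unpack_document(doc_path, out_dir):
    """Extract a zip-based document (odt, docx, ...) into a directory tree."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(doc_path) as zf:
        zf.extractall(out)
        return sorted(zf.namelist())


def repack_document(src_dir, doc_path):
    """Rebuild a document from an unpacked tree.

    If a top-level `mimetype` file exists it is written first and
    uncompressed, as ODF expects; everything else is deflated.
    """
    src = Path(src_dir)
    mimetype = src / "mimetype"
    files = sorted(p for p in src.rglob("*") if p.is_file())
    with zipfile.ZipFile(doc_path, "w", zipfile.ZIP_DEFLATED) as zf:
        if mimetype.is_file():
            zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in files:
            if path != mimetype:
                zf.write(path, path.relative_to(src).as_posix())
```

The unpacked tree is mostly XML, so `git diff` shows something readable instead of "binary files differ".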

Long story short, some recruiters don't like PDFs (from LaTeX) so I've got a mixed setup of LaTeX and LibreOffice. The LaTeX copy produces several different documents with varying amounts of personal information. This is toggled through variables in the script, so some outputs are web-safe (nothing but email), and some I send to recruiters; those carry temporary phone numbers, which are disconnected when I'm FTE. Another has a real SIM mobile number on it, which I use on company internal job boards.

LaTeX wins here, but to keep some recruiters happy I copy the changes into an odt format. Yes I know about Calibre.


>so I just export my results to latex format with pandas.

In other words, you don't do it by hand. I too rely on a tool to create my tables, which was my point. Doing it by hand is painful.


> every company/hiring startup is making tools to just feed your word docs

Whereas pdftotext tools have been there since forever.

I've written tools to try and extract text from docx files programmatically (not CVs but citations and footnote citations) and I seriously doubt "hiring startups" wouldn't just ask you to fill a web form.

(As a matter of fact, I'm low-key job hunting and what I've seen typically is that the systems try to extract the text from my non-standardly formatted, LaTeX-made PDF, and then pre-fill a form. The form is usually right but needs fixes here and there. It's a good approach.)
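On the docx side, a bare-bones sketch of the kind of extraction I mean. This is not the tool I wrote; `docx_paragraphs` is a made-up name, and it only uses the standard library. A .docx is a zip archive whose main body lives in `word/document.xml`, with text in `w:t` elements inside `w:p` paragraphs.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w: prefix in document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(docx_file):
    """Return the plain text of each paragraph in the main document body.

    Accepts a path or a file-like object. Footnotes live in a separate
    part (word/footnotes.xml) and are not read here. Paragraphs nested
    inside other paragraphs (text boxes) would be counted twice.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Getting the raw text out is the easy part; deciding which paragraph is a job title and which is an address is where the parsers fall over.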
