
I have a dream... that one day people will name their projects with names that don't exist on google yet.

If you search for PEP now you'll find Python Enhancement Proposals, the "Philippine Entertainment Portal", and the stock code for PepsiCo.




I wish that once people do name their project, they would assign it a 128-bit random number in lower case hex, and include that number on any web page that they would like people searching for their project to find.

That way once I know that say PEP the PDF editor exists and find its 128-bit number (let's say that is 379dd864b16eaca3ce94c15a6bdfcc73), at least I can subsequently toss a +379dd864b16eaca3ce94c15a6bdfcc73 on my searches to effectively let the search engine know I want PEP the PDF editor results rather than PEP the python enhancement results or PEP the entertainment portal results or PEP that refreshing beverage company stock symbol.

"xxd -l 16 -p /dev/urandom" is a handy way to get a 128-bit random hex number. A UUID generator works, too, although they usually include some punctuation you will need to delete and you might have to lower case their output.
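For anyone without xxd handy, roughly the same thing can be done from Python's standard library. This is only a sketch: the names `new_project_token`, `token_from_uuid` and `is_project_token` are made up for illustration, and nothing here is an agreed convention.

```python
import re
import secrets
import uuid

# 128 bits written as exactly 32 lowercase hex characters.
TOKEN_RE = re.compile(r"[0-9a-f]{32}")


def new_project_token() -> str:
    """Return 128 random bits as 32 lowercase hex characters."""
    return secrets.token_hex(16)


def token_from_uuid(value: uuid.UUID) -> str:
    """Drop the dashes from a UUID; .hex is already lowercase."""
    return value.hex


def is_project_token(text: str) -> bool:
    """True if text is exactly 32 lowercase hex characters."""
    return TOKEN_RE.fullmatch(text) is not None


token = new_project_token()
print(token, is_project_token(token))
```

`secrets.token_hex(16)` reads 16 bytes from the OS random source, much like the /dev/urandom command above.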


> I wish that once people do name their project, they would assign it a 128-bit random number in lower case hex

We already have something similar: URLs.


Except a URL only points to one resource. The idea here is that this identifier would exist on any resource related to PEP (maybe even in URLs).

That's not a problem: you just add a meta or a link tag that points to "the" URL for your project (maybe og:app-id, or a link with rel="app").
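A rough sketch of how a crawler might read such a tag, using only the standard library. Note that rel="app" is the hypothetical value floated in this comment, not a registered link relation, and `ProjectLinkParser` and `canonical_project_urls` are names invented here.

```python
from html.parser import HTMLParser


class ProjectLinkParser(HTMLParser):
    """Collect href values of <link rel="app"> tags (hypothetical rel value)."""

    def __init__(self) -> None:
        super().__init__()
        self.project_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attr.get("rel") or "").split()
        if tag == "link" and "app" in rels and attr.get("href"):
            self.project_urls.append(attr["href"])


def canonical_project_urls(html: str) -> list[str]:
    """Return every project URL a page declares via <link rel="app">."""
    parser = ProjectLinkParser()
    parser.feed(html)
    return parser.project_urls


page = '<html><head><link rel="app" href="https://example.org/"></head></html>'
print(canonical_project_urls(page))
```

Two pages that declare the same URL this way could then be grouped as being about the same project.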

>Except a URL only points to one resource

Isn't that exactly what this is asking for, though? A URL can by definition only point to one resource. So if you include that URL with every other reference to the project (in the app descriptions, blog posts about it, etc.), then you always know you're talking about the same thing. It makes a lot more sense for any resource related to this PDF editor to include a link to "https://macpep.org" than some random 128-bit string. Any resource related to Python PEPs should include a link to "https://www.python.org/dev/peps/" (which all PEPs do, by virtue of having a URL that's a subdirectory of the PEP index URL).


More in line with using "golang" since it is far easier to search for than just "go"

Yeah but urls and names might need to change due to marketing. A hash would uniquely id the project and let the marketing aspect be dynamic.

Although if the actual project name, authors, and codebase change, is it even still the same project?


Then maybe what really needs to happen is we burn all marketers in a pyre? So they quit fucking up things that ain’t fucked up?

There's a number of professions I'd like to add to that pyre please. ;)

tzs is proposing URNs rather than textual program names. A URL is unnecessarily specific (though I suppose you could anycast URL resolution)

> RFC 4122 defines a Uniform Resource Name (URN) namespace for UUIDs. A UUID presented as a URN appears as follows:[1]

> > urn:uuid:123e4567-e89b-12d3-a456-426655440000

https://en.wikipedia.org/wiki/Universally_unique_identifier#...

Version 4 UUIDs have 122 random bits (out of 128 bits total).
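The 122 figure comes from subtracting the fixed bits: 4 bits carry the version number and 2 bits carry the RFC 4122 variant. A quick sketch of that arithmetic, with constant names chosen here for illustration:

```python
import uuid

VERSION_BITS = 4  # always 0b0100 in a version 4 UUID
VARIANT_BITS = 2  # always 0b10 for the RFC 4122 variant
RANDOM_BITS = 128 - VERSION_BITS - VARIANT_BITS

sample = uuid.uuid4()
print(RANDOM_BITS)
print(sample.version, sample.variant == uuid.RFC_4122)
```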

In Python:

  >>> import uuid
  >>> _id = uuid.uuid4()
  >>> _id.urn
  'urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee'

Whether search engines will consider a URL or a URN or a random str without dashes to be one searchable-for token is pretty ironic in terms of extracting relations between resources in a Linked Data hypergraph.

  >>> _id.hex
  '4c466878a81b4f22a112c704655fa4ee'

The relation between a resource and a Thing with a URI/URN/URL can be expressed with https://schema.org/about . In JSON-LD ("JSONLD"):

  {"@context": "https://schema.org",
   "@type": "WebPage",
   "about": {
     "@type": "SoftwareApplication",
     "identifier": "urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee",
     "url": ["", ""],
     "name": [
       "a schema.org/SoftwareApplication < CreativeWork < Thing",
       {"@value": "a rose by any other name",
        "@language": "en"}]}}

Or with RDFa:

  <body vocab="https://schema.org/" typeof="WebPage">
    <div property="about" typeof="SoftwareApplication">
      <meta property="identifier" content="urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee"/>
      <a property="url" href=""></a>
      <a property="url" href=""></a>
      <span property="name">a schema.org/SoftwareApplication &lt; CreativeWork &lt; Thing</span>
      <span property="name" lang="en">a rose by any other name</span>
    </div>
  </body>

Or with Microdata:

  <div itemtype="https://schema.org/WebPage" itemscope>
    <link itemprop="http://www.w3.org/ns/rdfa#usesVocabulary" href="https://schema.org/" />
    <div itemprop="about" itemtype="https://schema.org/SoftwareApplication" itemscope>
      <a itemprop="url" href=""></a>
      <a itemprop="url" href=""></a>
      <meta itemprop="identifier" content="urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee" />
      <meta itemprop="name" content="a schema.org/SoftwareApplication &lt; CreativeWork &lt; Thing"/>
      <meta itemprop="name" content="a rose by any other name" lang="en"/>
    </div>
  </div>
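Going the other way, a consumer could pull the identifier back out of the JSON-LD and turn it into the dash-free hex form. A minimal sketch, assuming the page embeds a trimmed version of the JSON-LD above; `search_token` is a name invented here.

```python
import json
import uuid

JSONLD = """
{"@context": "https://schema.org",
 "@type": "WebPage",
 "about": {
   "@type": "SoftwareApplication",
   "identifier": "urn:uuid:4c466878-a81b-4f22-a112-c704655fa4ee"}}
"""


def search_token(jsonld_text: str) -> str:
    """Return the about.identifier URN as 32 lowercase hex characters."""
    doc = json.loads(jsonld_text)
    urn = doc["about"]["identifier"]
    # uuid.UUID accepts the urn:uuid: prefix and validates the rest.
    return uuid.UUID(urn).hex


print(search_token(JSONLD))
```

This yields the same string as `_id.hex` in the Python session above.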

Instead of trolling HN you could contribute to Wikidata.

8 bit ascii could work.

8^256 is a huge number.


I can’t even this math.

I worded wrong.

A 256 bit number is 2^256 possible combinations. 8^256 is the same as (2^3)^256 (or 2^768)

Er yeah. Derp.

8 character ascii not 8 bit.

It's early.

That's 8 bits ^ 8.

Or 256 ^ 8.

and easily able to be represented searching online with 8 characters.
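To keep the exponents in this subthread straight, here is the arithmetic spelled out with Python's arbitrary-precision integers. The variable names are just for illustration.

```python
# 8 raised to the 256th power, as first written above.
assert 8 ** 256 == 2 ** 768

# Eight 8-bit characters: 256 choices per position, 8 positions.
eight_bytes = 256 ** 8
assert eight_bytes == 2 ** 64

# A 128-bit identifier has this many times more possible values.
ratio = 2 ** 128 // eight_bytes
print(ratio == 2 ** 64)
```

So eight full bytes is a 64-bit space, which is the square root of the 128-bit space rather than something larger.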


Don't get discouraged, but you might still have a bug or two to work out with your new and never-before-tried "256^8 and easily able to be represented searching online with 8 characters" design. For your beta test, here are some of the unique 8-char identifiers you might want to try searching for: `unique `, ` unique `, ` unique`, `Unique `, `u^Hunique`, `un^Hnique`, `uniq^H^H^H^H`, ` . . . .`, `. . . . `, `uniqueESCESC`, `BELBELBELBELBELBELBELBEL`, ...
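A small sketch of why those candidates are a problem. It assumes a search box drops control characters, collapses whitespace and ignores case, which is a guess at typical behavior and not a description of any particular engine; `normalize` is a made-up helper.

```python
def normalize(token: str) -> str:
    """Roughly mimic a text field plus a search tokenizer.

    Assumption: control characters are dropped, runs of whitespace
    are collapsed, and case is ignored. Real engines differ.
    """
    visible = "".join(ch for ch in token if ch.isprintable())
    return " ".join(visible.split()).lower()


# Five distinct 8-character identifiers, padded or cased differently.
candidates = ["unique  ", " unique ", "  unique", "Unique  ", "unique\x1b\x1b"]
survivors = {normalize(c) for c in candidates}
print(survivors)
```

Under those assumptions all five collapse to the same token, and eight BEL characters collapse to nothing at all.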

Lol, thanks for your totally non sarcastic assistance. It needs some fleshing out.

If the alternative is a 32-char, 128-bit hex string, I think that's a little excessive to expect people to use, especially when an 8-char ASCII string has way more variation and is way easier to remember.

687c066db3458f7cbd5cc8bd58a65c64.

Vs.

*Xrh6x1!

You just have to eliminate dictionary words.
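For comparison, a back-of-the-envelope estimate of how much variation 8 typeable characters carry. It assumes the 94 printable, non-space ASCII characters and does not yet subtract dictionary words; the names are illustrative.

```python
import math

PRINTABLE_NO_SPACE = 94  # ASCII 0x21 through 0x7E

# Bits of variation in 8 characters drawn from that alphabet.
bits_8_chars = 8 * math.log2(PRINTABLE_NO_SPACE)

# Characters needed from that alphabet to reach 128 bits.
chars_for_128_bits = math.ceil(128 / math.log2(PRINTABLE_NO_SPACE))

print(round(bits_8_chars, 1), chars_for_128_bits)
```

That works out to a bit over 52 bits for 8 characters, and about 20 such characters to match a 32-character hex string.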


If you want something like that with binaries, Nix(OS) might be for you.

That's more or less what Ted Nelson envisioned in Xanadu, and that's why he usually says that modern cut and paste has nothing to do with the real cut and paste and considers it “a crime against humanity.”

Even though he's friends with Larry Tesler, the man responsible for our modern use of cut and paste.

Links should lead back to the original source, not point to some random text with no context that needs to be indexed.


That's actually a pretty solid idea. A meta tag for topics.

Only issue would be handling inevitable "SEO-ified" abusers of it.


Simple solution: 379dd864b16eaca3ce94c15a6bdfcc73™

Not sure I follow. Are you saying that by trademarking the fingerprint you can prevent SEO abuse?

The problem with any SEO mitigation is that the 128 bit string is intended for SEO. If you make a cool new thing then I blog about it I want to use your 128 bit string and you want me to use it too! So how do you prevent someone else from putting it on a linkfarm? I don't think trademark helps there.


If your page has 100 of these 128-bit strings in it, then it's even clearer that it's a link farm page that can be downranked.
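A sketch of that heuristic, assuming tokens are 32 lowercase hex characters. The cutoff of 5 is an arbitrary placeholder and `looks_like_link_farm` is a hypothetical name; a real ranking signal would need far more than this.

```python
import re

# A standalone run of exactly 32 lowercase hex characters.
TOKEN_RE = re.compile(r"\b[0-9a-f]{32}\b")
MAX_TOKENS_PER_PAGE = 5  # hypothetical cutoff, not from any real engine


def distinct_tokens(page_text: str) -> set[str]:
    """Return the distinct project tokens found in the page text."""
    return set(TOKEN_RE.findall(page_text))


def looks_like_link_farm(page_text: str, limit: int = MAX_TOKENS_PER_PAGE) -> bool:
    """Flag pages that carry more distinct tokens than the limit."""
    return len(distinct_tokens(page_text)) > limit


honest = "PEP the PDF editor 379dd864b16eaca3ce94c15a6bdfcc73"
farm = " ".join(f"{i:032x}" for i in range(100))
print(looks_like_link_farm(honest), looks_like_link_farm(farm))
```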

This solves the namespacing problem and allows creators and consumers to use different names if they want. Searching based on the creator's original name for a project becomes a mess because there will be a very large number of HelloWorld applications out there. Interestingly enough, the Google web store sort of already does this. The issue that comes up fairly quickly, though, is how to deal with the relationships between different packaged and published versions of what is nominally the same code base, or even forks/branches of the same code base. The hard part is maintaining a verifiable and discoverable chain for published artifacts without completely confusing users or exposing them to various malicious attacks (change a single byte in the middle of that random string and you have a nice off-by-one attack). Lots of infrastructure would be required to pull this off, but it would be great if it could be built.
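One way a tool could catch the "change a single byte in the middle" trick is to compare a candidate identifier with known ones and flag near misses. A minimal sketch; `hex_distance`, `is_lookalike` and the threshold of 2 are assumptions made for illustration.

```python
def hex_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length identifiers differ."""
    if len(a) != len(b):
        raise ValueError("identifiers must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_lookalike(candidate: str, known: str, max_diff: int = 2) -> bool:
    """True if candidate is not the known id but differs in only a few places."""
    return 0 < hex_distance(candidate, known) <= max_diff


known = "379dd864b16eaca3ce94c15a6bdfcc73"
# Swap one hex digit in the middle of the string.
spoof = known[:16] + "d" + known[17:]
print(hex_distance(spoof, known), is_lookalike(spoof, known))
```

Two honestly generated random identifiers will almost never be this close, so a near miss is a reasonable signal that something is off.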

I too dream of content addressable web.


That's actually a pretty good idea. Kind of like an official @mention/#hashtag for an exact topic; if it somehow wasn't abused by people, it would definitely improve related search results. Navigating user intent algorithms is getting more difficult.

Does schema.org etc support ids beyond keywords/categories? I guess the id could just be a keyword.

Maybe a public registry where you claim an id for a topic, similar to claiming a Yelp page or an ISBN. Then anyone posting related content includes that id. Popular topics could be grouped. You could generate memorable ids for most known topics/products/etc., and people would just use them organically; robots could also apply them automatically over time.

It's especially bad for phrases with many meanings, like "bridge repair", which could mean a dental bridge, a guitar bridge, or a bridge over a lake.
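As a toy illustration of the claim-an-id step, here is an in-memory stand-in for such a registry. Everything about it (the `TopicRegistry` class, its methods, random ids instead of memorable ones) is a hypothetical design, and it ignores the hard parts such as ownership and abuse.

```python
import secrets
from typing import Optional


class TopicRegistry:
    """In-memory stand-in for a public topic registry (hypothetical design)."""

    def __init__(self) -> None:
        self._by_id: dict[str, str] = {}

    def claim(self, description: str) -> str:
        """Register a topic and return its freshly generated id."""
        topic_id = secrets.token_hex(16)
        while topic_id in self._by_id:  # collisions are astronomically unlikely
            topic_id = secrets.token_hex(16)
        self._by_id[topic_id] = description
        return topic_id

    def lookup(self, topic_id: str) -> Optional[str]:
        """Return the description for an id, or None if it was never claimed."""
        return self._by_id.get(topic_id)


registry = TopicRegistry()
editor = registry.claim("PEP (PDF editor)")
proposals = registry.claim("PEP (Python Enhancement Proposals)")
print(editor != proposals, registry.lookup(editor))
```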


Just added one to a project of mine[0] (just a YouTube browser using the RSS thing YouTube does). Hope it catches on!

[0] - https://github.com/devenblake/ytfeed.py


I kind of do this on my blog. Each page is assigned a UUID.

A name that doesn't exist on google? So what exactly would that be?

It is very obvious that if you google "pdf pep", Python enhancement proposals or Pepsi are not going to show up.

Naming is hard; I have a dream that people would stop complaining about it. There are names/acronyms for literally everything, so the chance of finding something unique is very, very small.


Google > pdf pep

Nope. First page is mostly CDC, WHO, etc. Nothing about python or this project.


Yep, naming is hard. And I like the name Wine most.

(Wine Is Not an Emulator)

Thank you for understanding. :)


“GNU’s Not Unix”

Everybody's Google results are different these days. They put you in a bubble. Hence, saying "if you Google you should see result X or Y" isn't necessarily true.

I would say for acronyms containing 2, 3, 4 letters these are all going to be taken at this point.

What matters is how much the acronyms overlap. Pepsi (food & drink) has nothing to do with PDF editors (tech).

pEp (or p=p) [1] on Android is a nice K-9 fork with material design and GPG support / opportunistic encryption. It's not very well known, though.

Worst would've been if there's a PEP directly related to PDF.

[1] https://www.pep.security


First time seeing a .security website in my life.

We need to find an actionable suggestion. Maybe projects can have a long name and a nick name? To make everyone happy, and of course, to be useful for everyone.

pePDF? Seems relatively unused compared to pep.

Unfortunately that will never happen.

The good news is google is smart, and if you add a couple of subject keywords it pretty much always works.

For example if you search "pep pdf editor" the site shows up in first place.

My only issue is naming things after words that are so incredibly common they're on practically every page anyways, and thus truly useless for searching. I'm looking at you, Go.


I did not expect that googling "pep pdf editor" would show my site in first place, because I published the link to my site only a few days ago.

And Google is smart, as you said.

PEP is also "Politically exposed person" in anti-money-laundering circles, and often the lists we get are PDFs.

Seems like PEP is used for more things too :)


I study biochemistry, so “phosphoenolpyruvate” was the first thing to come to mind, ha.

Also Python Enhancement Proposal

That was already pointed out upthread from you.

In previous times, I worked for a company that had the acronym AAPL. I kept getting the stock quotes for Apple every time I browsed the company's intranet.

I once had to google something about the Thread protocol. Impossible. I think I haven't seen a worse name for a tech project.

Keep dreaming. :)

I usually solve this by giving the big G (or the big Duck) more info to work with in the query: https://duckduckgo.com/?q=pep+pdf+mac
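If you build such queries from a script, the standard library will do the encoding for you. A small sketch; `search_url` is a name invented here, and the base URL is the one from the link above.

```python
from urllib.parse import urlencode


def search_url(*terms: str, base: str = "https://duckduckgo.com/") -> str:
    """Join the terms with spaces and encode them as the q parameter."""
    return base + "?" + urlencode({"q": " ".join(terms)})


print(search_url("pep", "pdf", "mac"))
```

`urlencode` turns the spaces into plus signs, so this prints the same URL as the one linked above.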

I have a dream that dang will automate flagging and removing the inevitable inane comments about name uniqueness every time someone posts a project on HN.

Once you end up investigating an issue with something that namesquats another thing you'll understand.

Yes, having only been doing this twenty years, I'm unlikely to be versed in doing that.

It's like these companies can't hire marketing reps to establish a brand image that's new
