For the record, I'm working on a topological data analysis project that is in some ways quite similar to this, and I'd be astonished if a single community hospital has the depth needed. We're working with thirty-some.
Thank you for your feedback. I didn't know that topological data analysis even existed! I have some published and unpublished research on the topic.
I picked this particular dataset at random. I wanted something with integers, a large sample size, and high dimension.
I'm glad you did the community detection stuff. I've found it useful when the dimension is really large (on the order of a thousand variables), for instance when doing text mining.
very nice comment!
Topological Data Analysis was the next big thing a couple of years ago, and it certainly has its applications [1], but the breadth of its applicability may have been overhyped and the energy has fizzled somewhat.
The most well-known practitioners of this sort of 'topological' approach are Ayasdi; they have some slick demos [1]. The general name for this idea is topological data analysis [2].
I replicated this particular experiment in WL, of course, because it's a 5-minute thing to do [3], and I could actually do the community detection the author alluded to.
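The author's replication was in WL; here is a rough Python analogue of the same idea, sketched under the assumption that the dataset is the one shipped as sklearn's `load_digits` (an 8x8 version of the NIST handwritten digits). The 0.4 correlation threshold is an arbitrary choice for illustration, not taken from the original.

```python
# Sketch: treat the 64 pixel variables as graph nodes, connect strongly
# correlated pairs, then run greedy modularity community detection.
# (Assumptions: sklearn's digits dataset stands in for the original data;
# the 0.4 threshold is arbitrary.)
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities
from sklearn.datasets import load_digits

X = load_digits().data                            # shape (1797, 64)
C = np.nan_to_num(np.corrcoef(X, rowvar=False))   # constant pixels give NaN rows

G = nx.Graph()
G.add_nodes_from(range(C.shape[0]))
for i in range(C.shape[0]):
    for j in range(i + 1, C.shape[0]):
        if abs(C[i, j]) > 0.4:                    # keep only strong correlations
            G.add_edge(i, j, weight=abs(C[i, j]))

communities = greedy_modularity_communities(G)
print(len(communities), "communities on", G.number_of_nodes(), "nodes")
```

The threshold step is where information gets thrown away, which may be why the raw correlation matrix looks more suggestive than the resulting graph.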
But I noticed that the correlation matrix itself is much more suggestive than the graph ends up being, with or without community detection. Take a look at the correlation matrix (note that MatrixPlot does some clever combination of rank and absolute value to get high dynamic range):
The tri-diagonal structure arises because the original dataset is derived from pixel counts over successive 4x4 tiles of NIST handwritten-digit images [4].
That 8x8 matrix of tiles is flattened into the 64 random variables, so the large correlation with tiles to the left and right explains the off-by-1 orange diagonal lines; the other two diagonals are offset by 8 and correspond to the high correlation with the tiles above and below. That's the 'connectivity kernel' of a 2D manifold, so to speak.
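The claimed diagonal structure can be checked numerically. Assuming sklearn's `load_digits` is the same 8x8 dataset, the mean absolute correlation along the offset-8 diagonal (vertically adjacent tiles) should dominate a non-neighbour offset such as 13, picked here arbitrarily as a control.

```python
# Sanity check of the off-diagonal claim: in a row-major 8x8 flattening,
# flat-index offset 8 pairs vertically adjacent tiles, while offset 13
# pairs spatially distant tiles.
import numpy as np
from sklearn.datasets import load_digits

X = load_digits().data
C = np.nan_to_num(np.corrcoef(X, rowvar=False))  # constant pixels give NaN rows

def mean_abs_at_offset(C, k):
    """Mean |correlation| along the k-th superdiagonal."""
    return np.abs(np.diagonal(C, offset=k)).mean()

print("offset 8:", mean_abs_at_offset(C, 8))
print("offset 13:", mean_abs_at_offset(C, 13))
```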
The squiggles in all the other blocks of this matrix are curious. I don't know what's going on there; maybe something interesting.
The total is about a million nodes, though I'm fairly sure the most interesting data is a subset of about 50,000 nodes. Each node has on the order of 100 edges. Since that's still a lot, I'll have to rethink my plan, I'm afraid.
Thanks. I skimmed the linked "Mapper" article they cite as their method, and it looks about as topological as t-SNE, i.e. in the sense of caring about local nearness but not global distance.
But did I miss any heavier stuff? Is this all people mean when they talk about topological data analysis?
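For readers unfamiliar with Mapper, a minimal sketch may make the "local nearness, no global distance" point concrete. This is not the cited implementation; it assumes a 1-D lens, an overlapping interval cover, and single-linkage clustering at a fixed cut distance, which is roughly the recipe the Mapper paper describes.

```python
# Minimal Mapper sketch: cover the lens values with overlapping intervals,
# cluster the points in each interval's preimage, make one node per cluster,
# and join two nodes whenever their clusters share a point.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def mapper_graph(X, lens, n_intervals=8, overlap=0.3, cut=1.0):
    lo, hi = lens.min(), lens.max()
    width = (hi - lo) / n_intervals
    nodes, edges = [], set()
    for k in range(n_intervals):
        a = lo + k * width - overlap * width          # interval with overlap
        b = lo + (k + 1) * width + overlap * width
        idx = np.where((lens >= a) & (lens <= b))[0]
        if len(idx) == 0:
            continue
        if len(idx) == 1:
            labels = np.array([1])
        else:
            labels = fcluster(linkage(X[idx], method="single"),
                              t=cut, criterion="distance")
        for lab in np.unique(labels):
            nodes.append(set(idx[labels == lab]))
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if nodes[i] & nodes[j]:                   # shared points -> edge
                edges.add((i, j))
    return nodes, edges

# Example: points on a circle, lensed by their x-coordinate. Mapper should
# recover the circle as a cycle graph, knowing nothing about global distances.
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)]
nodes, edges = mapper_graph(X, X[:, 0], n_intervals=6, overlap=0.3, cut=0.5)
print(len(nodes), "nodes,", len(edges), "edges")
```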
The paper seems to be using homology as the topological feature here. I've done some work in topological data analysis before, and it feels like the hidden issue is that computing homology is generally very inefficient (since it usually amounts to reducing an n×n matrix).
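The matrix reduction being alluded to is the standard persistence algorithm: columns of the filtered boundary matrix, over Z/2, are XORed left to right until every non-empty column has a unique lowest row, which is worst-case cubic in the number of simplices. A toy sketch on a filtered triangle:

```python
# Standard persistence reduction over Z/2. Columns are stored as sets of
# row indices; we repeatedly XOR a column with the earlier column owning the
# same lowest row. Paired (birth, death) indices give finite persistence
# intervals; unpaired birth simplices give essential (infinite) classes.
def reduce_boundary(cols):
    cols = [set(c) for c in cols]
    low_of = {}                       # lowest row index -> owning column
    pairs = []
    for j in range(len(cols)):
        while cols[j] and max(cols[j]) in low_of:
            cols[j] ^= cols[low_of[max(cols[j])]]     # Z/2 column addition
        if cols[j]:
            low_of[max(cols[j])] = j
            pairs.append((max(cols[j]), j))
    paired = {i for p in pairs for i in p}
    essential = [j for j in range(len(cols)) if j not in paired and not cols[j]]
    return pairs, essential

# Filtration of a triangle: vertices 0,1,2, then edges 01,12,02, then the
# 2-cell whose boundary is the three edges.
boundary = [[], [], [],
            [0, 1], [1, 2], [0, 2],
            [3, 4, 5]]
pairs, essential = reduce_boundary(boundary)
print(pairs, essential)
```

Here the loop born with edge 02 dies when the triangle fills it in, and one connected component (vertex 0) survives forever; on real filtrations the column sets grow large, which is where the inefficiency bites.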
It definitely feels like graphs and topology should be helpful tools for working with data (since graph-like structures are good representations of the real world), but we need to solve this efficiency issue before that can happen.
Also, to address the confusion about how category theory comes into it: category theory studies abstract structures where you have objects and relationships between those objects. A lot of algebraic topology (which is the sort of topology relevant here) is built in the language of category theory (either by necessity or by convention).
I'm nearly salivating at the thought of writing graph algorithms at that scale . . . and actually having the outcome mean something and be acted upon in a timely fashion. It sounds like a dream job to me. That scale and depth of information is a very powerful tool, no doubt, and it should be wielded for a good purpose. This article at least encourages me that people are thinking beyond the bottom line on these issues. Awesome.
Yeah, now I'm wondering what topological skeletons might have in common with the abstract simplicial complexes generated by running a persistent homology algorithm on a point cloud.
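For concreteness, the simplicial complexes fed into persistent homology are usually Vietoris-Rips complexes: one simplex for every subset of points that are pairwise within a scale parameter eps. A brute-force sketch up to dimension 2 (this is illustrative only; real libraries build these far more cleverly):

```python
# Vietoris-Rips complex of a point cloud at scale eps, up to triangles.
# A k-vertex subset is a simplex iff all its pairwise distances are <= eps.
import numpy as np
from itertools import combinations

def rips_complex(points, eps, max_dim=2):
    n = len(points)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):               # k = number of vertices
        for s in combinations(range(n), k):
            if all(d[i, j] <= eps for i, j in combinations(s, 2)):
                simplices.append(s)
    return simplices

# Four corners of a unit square at eps = 1.1: the four sides (length 1)
# become edges, the diagonals (length ~1.41) do not, so no triangles form
# and the complex is a hollow square -- a loop persistent homology detects.
pts = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
S = rips_complex(pts, eps=1.1)
print(S)
```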
Also very cool is the detection of branching in the data via the computation of persistent Borel-Moore homology. This is the method that was used in their cancer study.