20,000 Roam Tags with Spacy

I am not an experienced `#roamcult` aficionado, but I do think Roam is pretty cool.

If [[double bracket]] backlinks are what gets you in the door — then I’ve brought a crapton of baggage with me into the doorway so far.

→ My basic idea when wrapping my head around Roam the last month:

I really like Blinkist for book summaries.
I really like Smash Notes and PodcastNotes for podcast summaries.
I would love to see all the connections between the different summaries.

My General Roam Strategy

Unlike some of the more advanced Roam tutorials, I’ve actually brought a lot of my old system over with me.

→ I `#hashtag` quite a bit.

In general, I default to a combination of tags to keep general/interesting collections:

#interesting
#quote
#science

I tend to put these tags at the end of blocks:

Cool block of text goes here. #interesting #quote

Or at the top of a page in a Keywords “attribute”.

→ I braindump a bunch of random thoughts into Daily Notes.

Kudos to Roam: it’s really the only Daily Notes system that I have felt comfortable with — even after trying to regularly use Day One, iA Writer, etc for years.

→ Tim Harford’s Messy book and Steven Whitaker’s email study influenced my ideas on folders/searching/tags/etc.

Whittaker and his colleagues got permission to install logging software on the computers of several hundred IBM office workers, and tracked around 85,000 attempts to find email by clicking through folders, or by using ad hoc methods.

Whittaker found that clicking through a folder tree took almost a minute, while simply searching took just 17 seconds. People who relied on folders took longer to find what they were looking for, but their hunts for the right email were no more or less successful.

In other words: if you just dump all your email into a folder called “archive,” you will find emails more quickly than if you hide them in a tidy structure of folders.

Adding Blinkist and podcast notes to Roam

“It seemed like a nice neighborhood to have bad habits in.”
— Raymond Chandler

Again, my basic idea after wrapping my head around Roam:

I really like Blinkist for book summaries.
I really like Smash Notes and PodcastNotes for podcast summaries.
I would love to see all the connections between the different summaries.

→ How would I classify “connections between” summaries?

When I was importing my first few summaries and blog posts into Roam, I found I was most interested in PEOPLE, COMPANIES, GEOGRAPHIES, and WORKS_OF_ART.

These all just happen to be Spacy Named Entities.

→ I like the gist of things for most podcasts.

For instance: I really like some of Scott Adams’ ideas — particularly those in How to Fail at Almost Everything and Still Win Big — but I don’t need to hear “the power of hypnosis” for the 100th time.

Automated Tagging with `Spacy`

→ In the end, I used `Spacy` for tagging `PEOPLE` and `WORKS_OF_ART`.

Spacy’s GEOGRAPHY and ORGANIZATION tagging ended up not fitting my use case — so I excluded it from the final version.

It’s pretty simple Python code for adding double brackets:

import spacy
nlp = spacy.load("en_core_web_sm")

def add_brackets_to_proper_nouns(text):
    doc = nlp(text)
    for ent in doc.ents:
        if ent.label_ == "PERSON" and " " in ent.text:
            text = text.replace(ent.text, f"[[{ent.text}]]")
        elif ent.label_ == "WORK_OF_ART":
            text = text.replace(ent.text, f"[[{ent.text}]]")
    return text.strip()

And the Roam JSON format is a breeze to create:

{
  "title": "20,000 Roam Tags and Counting",
  "children": [{
    "string": "Super simple to use."
  }, {
    "string": "Easy to create."
  }]
}

→ It’s messy, but Roam is pretty useful.

I’ve found Roam more useful reading through summaries and finding connections than the Jupyter notebooks I was using.

A Few Interesting Connections

Roam has slowed down for me (it’s obviously sluggish to get the page to load sometimes), but I have found some cool cross linking of ideas.

→ Few authors properly credit Bill Klann for Ford’s assembly line.

Henry Ford is one of the most popular individuals written about on Blinkist — but only Shortcut does a good job setting the record straight:

Bill Klann applied the slaughtering analogy to automobile assembly lines.
His managers weren’t even big fans of the idea at first.

Klann told Martin that his trip to the slaughterhouse had given him an idea with big potential for Ford. From a process standpoint, he explained, there was no difference between taking things apart and putting things together. So just as Swift disassembled animals on a moving conveyor, couldn’t Ford assemble things using the same, efficient method?

“If they can kill pigs and cows that way, we can build cars that way,” Klann said.

“I don’t believe it,” Martin answered.

Klann was insistent. “They made it work down in Chicago. Why can’t we put them in here for pushing the job along the same way?”

Few recognized they were witnessing the birth of a second Industrial Revolution.

→ Postmodernism kind of sucks.

A few different Blinkist summaries try to tackle why postmodernism isn’t the best philosophy, but I came away loving Alan Sokal’s hoax from Fashionable Nonsense:

The paper was entitled “Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity” and immediately after its publication, Sokal revealed that it was full of postmodernist mumbo-jumbo that had no meaning whatsoever.

To provide evidence, Sokal used cryptically vague and obscure language in his text, such as: “the pi of Euclid and the G of Newton… are now perceived in their ineluctable historicity,” or, “the putative observer becomes fatally de-centered.”

If that sounds completely unintelligible to you, that was Sokal’s point. His hoax illustrated exactly how nonsensical certain areas of postmodernism had become.

By using meaningless jargon to form grammatically correct sentences, and being sure to cite major names in the field of physics, Sokal succeeded in getting his paper published in the fashionable journal, Social Text.

→ Intel has an interesting history — from multiple angles.

It seems like Hacker News loves High Output Management, but turns out that Intel’s history is way more interesting than Andy Grove’s management insights. Here’s a good recap from The Intel Trinity:

Unlike Gordon Moore and Robert Noyce, Grove wasn’t one of Intel’s founding members. He was an employee with a salary, and his financial stability hinged on the success of the company. This made him wary of risky strategies.

It was this fear that made him so skeptical of the company’s move into microprocessors. He wanted Intel to stick to its bread and butter, which was memory chips. Thankfully, he wasn’t able to stop Moore and Noyce, who knew the move was the correct one. Had Grove quashed the idea, Intel certainly would not have become the success it turned out to be.

→ I think I’m a gardener when it comes to notes.

There are two types of writers, the architects and the gardeners.

The architects plan everything ahead of time, like an architect building a house. They know how many rooms are going to be in the house, what kind of roof they’re going to have, where the wires are going to run, what kind of plumbing there’s going to be. They have the whole thing designed and blueprinted out before they even nail the first board up.

The gardeners dig a hole, drop in a seed and water it. They kind of know what seed it is, they know if planted a fantasy seed or mystery seed or whatever. But as the plant comes up and they water it, they don’t know how many branches it’s going to have, they find out as it grows. And I’m much more a gardener than an architect.

— George R.R. Martin

Sarah Tavel has brought up similar ideas in regards to venture capital.

I Also Really Like Automated Emails

I’m not solely relying on Roam for reading summaries. I’m a big believer in waking up to automated emails.

All enterprise software competes with Excel.

All productivity software competes with emailing things to yourself.

— Pavel Samsonov

Nightly automated emails are still my main source of reading Blinkist summaries. (Along with simple CatBoost models for predicting what I like.)