facebook/keyword-search

Search Facebook posts by keyword using the user's logged-in session. Unscrambles author names that Facebook obfuscates by shuffling characters with the CSS Flexbox order property. Returns author, body text, engagement counts, and a stable content hash in place of a post ID when no native post permalink is exposed.

Read-only. Source: DOM [role=article] + Flexbox-order unscramble

Run it

$ tap facebook/keyword-search {"keyword":"AI automation","limit":5}

What it returns

Column         Type
post_id        text
author_name    text
author_url     text
text           text
posted_at      text
url            text
like_count     text
comment_count  text
share_count    text
permalink      text
lang           text

Arguments

Name     Type    Default  Description
keyword  string  (none)   Search keyword
limit    int     20       Max posts to return

Why this tap exists

Facebook's /search/posts/?q=… page is one of the more hostile scraping targets on the open web. It requires login, ships no llms.txt or RSS, rotates internal GraphQL doc_ids weekly, and — most interestingly — applies Flexbox-order DOM scrambling to author display names.

When a naive scraper reads document.querySelectorAll('[role="article"]')[0].textContent, the first block of characters looks like random noise (oSodnprmmlffgfi1c3mSg…) followed by readable body text. Many give up here and declare the site un-scrapable. They are wrong.

The gibberish is the author name, split character by character across many <span>s, each assigned a non-zero CSS order. The browser's flexbox layout re-sorts them visually, but textContent returns them in DOM order, which is randomized. Post body, engagement counts, and aria-labels are not scrambled.

This tap loads the search page through the user's Chrome session and walks each [role=article]. For scrambled containers (every child's text is at most 2 characters long and at least one child has a non-zero order), it sorts the children by computed order and re-concatenates their text. Everything else is plain extraction.

Sample output

post_id      author_name   author_url                              text                                                 like_count  lang
fb_74ig3q    Snowie.Ai     https://www.facebook.com/SnowieAi       In 24 months every serious website will talk…        500         en

Known gaps. Facebook search cards do not expose a public post permalink href — the visible links are profile URLs plus encrypted __cft__ tracking params. When no native post ID is found, this tap emits a content-hash id (fb_…) that is stable across runs for the same author+body combination, so downstream deduplication still works.
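
One way to derive such an id is sketched below. The tap's real hashing scheme is not documented here, so the choice of SHA-256, the whitespace normalization, the base-36 alphabet, and the 6-character length are all assumptions for illustration; content_hash_id is a hypothetical name.

```python
import hashlib

BASE36 = "0123456789abcdefghijklmnopqrstuvwxyz"


def content_hash_id(author, body, length=6):
    """Build a deterministic fb_… id from author + body (illustrative only)."""
    # Collapse whitespace so layout differences between runs do not
    # change the id for the same visible content.
    normalized = " ".join(author.split()) + "\n" + " ".join(body.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    # Take the first 8 bytes as an integer and emit base-36 digits.
    n = int.from_bytes(digest[:8], "big")
    chars = []
    for _ in range(length):
        n, remainder = divmod(n, 36)
        chars.append(BASE36[remainder])
    return "fb_" + "".join(chars)


first = content_hash_id("Snowie.Ai", "In 24 months every serious website will talk")
second = content_hash_id("Snowie.Ai", "In 24 months  every serious website will talk ")
```

Because the id depends only on author and body, two runs that see the same post produce the same value, which is the property deduplication needs. An edit to the post body yields a different id.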

posted_at, comment_count, and share_count are best-effort: Facebook omits or obfuscates these on some search result layouts. The tap returns empty strings rather than failing the row — use tap verify to check health drift per field if you need one of them to be non-empty.
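
For a quick downstream check on how often the best-effort fields come back empty, a sketch like the following may help. It is not what tap verify does internally; it assumes rows are available as a list of dicts keyed by the column names above, and empty_rates is a hypothetical helper.

```python
BEST_EFFORT_FIELDS = ("posted_at", "comment_count", "share_count")


def empty_rates(rows, fields=BEST_EFFORT_FIELDS):
    """Return the fraction of rows in which each field is an empty string
    or missing. An empty input yields 0.0 for every field."""
    if not rows:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for row in rows if not row.get(field, "")) / len(rows)
        for field in fields
    }


# Hypothetical rows shaped like the tap's output.
rows = [
    {"posted_at": "", "comment_count": "3", "share_count": ""},
    {"posted_at": "2h", "comment_count": "", "share_count": ""},
]
rates = empty_rates(rows)
```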

Runtime

Chrome bridge only. Facebook blocks anonymous and non-browser clients at the search endpoint.

tap runtime chrome
tap facebook/keyword-search keyword="AI automation" limit=5

Case study

Full write-up — hypothesis, wrong diagnosis, the 5-minute diagnostic, the fix — in Facebook Scrambles Author Names, Not Post Bodies.

Provenance

Source
Machine format: keyword-search.jsonld (W3C Annotation)
Last verify run: never
License: MIT