Douyin Browser Capture

This tool drives a real Playwright Chromium session, lets a human log into Douyin, captures the loaded profile and work pages, and can sync the captured bundle into StoryForge's existing /v2/douyin/accounts/sync endpoint.

Install

cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm install
npx playwright install chromium

Run

cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm run capture -- \
  --profile-url https://www.douyin.com/user/your_account \
  --storyforge-username kris \
  --storyforge-password 'your_password'

The browser uses a persistent state directory under ~/.storyforge/douyin-playwright, so Douyin login can survive between runs.
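
As a rough illustration of why a persistent state directory lets the login survive, the sketch below resolves the path mentioned above and passes it to Playwright's `chromium.launchPersistentContext`. The helper names `resolveStateDir` and `openPersistentBrowser` are hypothetical and are not taken from this tool's source; the actual script may set this up differently.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: resolve the state directory described above.
function resolveStateDir(home = os.homedir()) {
  return path.join(home, ".storyforge", "douyin-playwright");
}

// Hypothetical helper: open a visible Chromium window backed by that directory.
// Playwright is required lazily so resolveStateDir works without it installed.
async function openPersistentBrowser(stateDir = resolveStateDir()) {
  const { chromium } = require("playwright");
  fs.mkdirSync(stateDir, { recursive: true });
  // Cookies and site storage are written to stateDir, so a later run that
  // reuses the same directory can start with the earlier login still in place.
  return chromium.launchPersistentContext(stateDir, { headless: false });
}
```

Deleting the directory is the simplest way to force a fresh Douyin login under this assumption.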

Control Panel

If you do not want to remember CLI arguments, start the local control panel:

cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm run control-panel

Then open http://127.0.0.1:3618 and use this flow:

  1. Fill in the Douyin profile URL and StoryForge credentials.
  2. Click 开始采集 (Start capture).
  3. A real Chromium window opens. Log into Douyin and solve any captcha there.
  4. Return to the control panel and click 已完成登录,继续采集 (Login complete, continue capture).
  5. Wait for summary.json and the optional StoryForge sync result.

The control panel stores each run under:

/Users/kris/code/StoryForge-gitea/output/playwright/douyin/control-panel

The StoryForge token field is kept only for the current browser session. It is not written back into the saved form values, so it will not be refilled from localStorage on the next launch.
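
One way to get this behavior is to strip session-only fields before the form is persisted. The sketch below is only an illustration: the field name `storyforgeToken`, the storage key, and all three helpers are hypothetical and do not come from the control panel's source.

```javascript
// Hypothetical field name; the real form may call it something else.
const SESSION_ONLY_FIELDS = ["storyforgeToken"];

// Hypothetical helper: persist everything except session-only fields.
// `storage` is anything with setItem/getItem, such as window.localStorage.
function saveFormValues(storage, values, key = "douyin-capture-form") {
  const persisted = { ...values };
  for (const field of SESSION_ONLY_FIELDS) delete persisted[field];
  storage.setItem(key, JSON.stringify(persisted));
  return persisted;
}

// Hypothetical helper: read back whatever was persisted, or an empty object.
function loadFormValues(storage, key = "douyin-capture-form") {
  const raw = storage.getItem(key);
  return raw ? JSON.parse(raw) : {};
}

// In-memory stand-in for localStorage so the sketch also runs under Node.
function createMemoryStorage() {
  const data = new Map();
  return {
    setItem: (k, v) => data.set(k, String(v)),
    getItem: (k) => (data.has(k) ? data.get(k) : null),
  };
}
```

With this shape, the profile URL and username come back on the next launch while the token has to be entered again.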

What it captures

  • current profile page JSON blobs extracted from <script> tags
  • selected window globals such as __INITIAL_STATE__
  • relevant JSON network responses
  • creator-center pages using the same logged-in browser context
  • a limited number of video detail pages linked from the profile
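
To make the first item concrete, here is a minimal sketch of pulling JSON blobs out of `<script>` tags in a page's HTML. The helper `extractJsonBlobs` is hypothetical, and the tool itself may well do this inside the page through Playwright rather than with a regular expression over the markup.

```javascript
// Hypothetical helper: collect every <script> body that parses as JSON.
function extractJsonBlobs(html) {
  const blobs = [];
  const scriptPattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptPattern)) {
    const body = match[1].trim();
    // Cheap pre-filter: JSON documents of interest start with "{" or "[".
    if (!body.startsWith("{") && !body.startsWith("[")) continue;
    try {
      blobs.push(JSON.parse(body));
    } catch {
      // Ordinary JavaScript that happens to start with a brace; ignore it.
    }
  }
  return blobs;
}

const sampleHtml = `
  <script>console.log("not json")</script>
  <script type="application/json">{"user": {"nickname": "demo"}}</script>
`;
console.log(extractJsonBlobs(sampleHtml));
```

Real pages may encode the embedded payload (for example with percent-encoding), in which case a decode step would be needed before `JSON.parse`.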

Output

Default output directory:

/Users/kris/code/StoryForge-gitea/output/playwright/douyin

Each run writes:

  • profile-bundle.json
  • creator-*.json
  • video-*.json
  • storyforge-sync-request.json
  • storyforge-sync-response.json when sync is enabled
  • summary.json
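
For a quick check of what a run produced, the file names above can be grouped by type. This is a sketch only: the helper names and grouping keys are hypothetical, and it assumes the files sit directly inside the directory you pass in.

```javascript
const fs = require("node:fs");

// File-name patterns taken from the list above; the keys are illustrative.
const ARTIFACT_PATTERNS = {
  profile: /^profile-bundle\.json$/,
  creator: /^creator-.*\.json$/,
  video: /^video-.*\.json$/,
  syncRequest: /^storyforge-sync-request\.json$/,
  syncResponse: /^storyforge-sync-response\.json$/,
  summary: /^summary\.json$/,
};

// Hypothetical helper: group file names by artifact type, ignoring others.
function groupArtifacts(fileNames) {
  const groups = {};
  for (const key of Object.keys(ARTIFACT_PATTERNS)) groups[key] = [];
  for (const name of fileNames) {
    for (const [key, pattern] of Object.entries(ARTIFACT_PATTERNS)) {
      if (pattern.test(name)) {
        groups[key].push(name);
        break;
      }
    }
  }
  return groups;
}

// Hypothetical helper: read a run directory from disk and group its files.
function inspectRun(runDir) {
  return groupArtifacts(fs.readdirSync(runDir));
}
```

An empty `syncResponse` group is expected when the run used --no-sync.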

Notes

  • This is designed as a browser-assisted capture flow, not a fully headless anti-bot bypass.
  • If Douyin shows a slider or challenge page, solve it manually in the opened browser window and then continue.
  • Use --no-sync if you only want to save a local bundle for inspection.
  • Use --ready-file /tmp/storyforge-ready.signal if you want another process or webpage to decide when capture continues.
  • Creator-center pages belong to the currently logged-in Douyin account. StoryForge now treats them as supplemental evidence by default and will not let them overwrite the target profile unless you explicitly pass --allow-creator-center-fallback.
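
The --ready-file handshake from the notes above can be pictured as a simple file-based signal: one side polls for the file and the other side creates it. The sketch below assumes that the file's mere existence is the signal. That is an assumption, not something confirmed by the tool's source, and both helper names are hypothetical.

```javascript
const fs = require("node:fs");

// Hypothetical helper: resolve once readyFile exists, or reject after timeoutMs.
function waitForReadyFile(readyFile, { intervalMs = 500, timeoutMs = 600000 } = {}) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    const check = () => {
      if (fs.existsSync(readyFile)) return resolve(readyFile);
      if (Date.now() >= deadline) {
        return reject(new Error(`Timed out waiting for ${readyFile}`));
      }
      setTimeout(check, intervalMs);
    };
    check();
  });
}

// Hypothetical helper: the other process signals readiness by creating the file.
function signalReady(readyFile) {
  fs.writeFileSync(readyFile, "ready\n");
}
```

Under the same assumption, a shell script could stand in for `signalReady` by running `touch /tmp/storyforge-ready.signal` once the login is done.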