# Douyin Browser Capture
This tool drives a real Playwright Chromium session, lets a human log into Douyin, captures the loaded profile and work (video) pages, and can optionally sync the captured bundle to StoryForge's existing `/v2/douyin/accounts/sync` endpoint.
## Install
```bash
cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm install
npx playwright install chromium
```
## Run
```bash
cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm run capture -- \
--profile-url https://www.douyin.com/user/your_account \
--storyforge-username kris \
--storyforge-password '<your-password>'
```
The browser uses a persistent state directory under `~/.storyforge/douyin-playwright`, so the Douyin login session persists between runs.
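
As a rough illustration of how a persistent state directory keeps a login alive, the sketch below uses Playwright's `launchPersistentContext`. The helper names `stateDir` and `openPersistentBrowser` are hypothetical, and the way this tool actually resolves and opens the directory is an assumption, not taken from its source.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: build the state directory path named in this README.
// How the real tool resolves this path internally is an assumption.
function stateDir(home = os.homedir()) {
  return path.join(home, ".storyforge", "douyin-playwright");
}

// Hypothetical helper: open a headed Chromium window backed by that directory.
// Playwright keeps browser session data (such as cookies and local storage)
// in the user data directory, so a later launch can reuse the same login.
async function openPersistentBrowser() {
  const { chromium } = require("playwright"); // loaded lazily, only when called
  return chromium.launchPersistentContext(stateDir(), { headless: false });
}
```

Deleting the directory should force a fresh Douyin login on the next run, assuming the tool stores nothing else there.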
## Control Panel
If you do not want to remember CLI arguments, start the local control panel:
```bash
cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm run control-panel
```
Then open [http://127.0.0.1:3618](http://127.0.0.1:3618) and use this flow:
1. Fill in the Douyin profile URL and StoryForge credentials.
2. Click `开始采集` ("Start capture").
3. A real Chromium window opens. Log into Douyin and solve any captcha there.
4. Return to the control panel and click `已完成登录,继续采集` ("Login complete, continue capture").
5. Wait for `summary.json` and the optional StoryForge sync result.
The control panel stores each run under:
`/Users/kris/code/StoryForge-gitea/output/playwright/douyin/control-panel`
The StoryForge token field is kept only for the current browser session. It is not written into the saved form values, so it is not restored from localStorage on the next launch.
## What it captures
- JSON blobs extracted from `<script>` tags on the current profile page
- selected window globals such as `__INITIAL_STATE__`
- relevant JSON network responses
- creator-center pages using the same logged-in browser context
- a limited number of video detail pages linked from the profile
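
To make the first item in the list above concrete, here is a minimal sketch of pulling JSON payloads out of `<script>` tags in an HTML string. The function name `extractScriptJson` is hypothetical, and the tool's real extraction logic may differ (for example, it may decode or filter payloads in ways not described here).

```javascript
// Hypothetical helper: collect every <script> body that parses as JSON.
function extractScriptJson(html) {
  const blobs = [];
  const scriptPattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptPattern)) {
    const body = match[1].trim();
    // Cheap pre-check: only object or array literals are candidates.
    if (!body.startsWith("{") && !body.startsWith("[")) continue;
    try {
      blobs.push(JSON.parse(body));
    } catch {
      // Looks like JSON but is not (probably regular JavaScript); skip it.
    }
  }
  return blobs;
}
```

With Playwright, the input could come from `await page.content()`, which returns the page's full HTML as a string.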
## Output
Default output directory:
`/Users/kris/code/StoryForge-gitea/output/playwright/douyin`
Each run writes:
- `profile-bundle.json`
- `creator-*.json`
- `video-*.json`
- `storyforge-sync-request.json`
- `storyforge-sync-response.json` when sync is enabled
- `summary.json`
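
For a quick check of what a run produced, the file names above can be grouped by kind. This is only a sketch: `classifyOutputFile` and `inventory` are hypothetical helpers, and the patterns are inferred from the file names listed in this README, not from the tool's source.

```javascript
// Hypothetical helper: map an output file name to the kind of artifact it holds.
function classifyOutputFile(name) {
  if (name === "profile-bundle.json") return "profile";
  if (/^creator-.+\.json$/.test(name)) return "creator";
  if (/^video-.+\.json$/.test(name)) return "video";
  if (name === "storyforge-sync-request.json") return "sync-request";
  if (name === "storyforge-sync-response.json") return "sync-response";
  if (name === "summary.json") return "summary";
  return "other";
}

// Hypothetical helper: count how many files of each kind a run directory holds.
function inventory(names) {
  const counts = {};
  for (const name of names) {
    const kind = classifyOutputFile(name);
    counts[kind] = (counts[kind] || 0) + 1;
  }
  return counts;
}
```

Passing `fs.readdirSync(outputDir)` to `inventory` gives a per-kind count for one run. A missing `sync-response` entry would be expected when sync is disabled.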
## Notes
- This is designed as a browser-assisted capture flow, not a fully headless anti-bot bypass.
- If Douyin shows a slider or challenge page, solve it manually in the opened browser window and then continue.
- Use `--no-sync` if you only want to save a local bundle for inspection.
- Use `--ready-file /tmp/storyforge-ready.signal` if you want another process or webpage to decide when capture continues.
- Creator-center pages belong to the currently logged-in Douyin account. StoryForge now treats them as supplemental evidence by default and will not let them overwrite the target profile unless you explicitly pass `--allow-creator-center-fallback`.
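
The `--ready-file` option implies a simple file-based handshake between the capture process and whatever decides that login is finished. The sketch below shows both sides under one loud assumption: that the capture side continues as soon as the file exists. The README does not state whether existence or content is checked, and `signalReady` and `waitForReadyFile` are hypothetical names.

```javascript
const fs = require("node:fs");

// Hypothetical signalling side: create the ready file (empty content).
function signalReady(filePath) {
  fs.writeFileSync(filePath, "");
}

// Hypothetical waiting side: resolve once the file exists, reject on timeout.
function waitForReadyFile(filePath, { intervalMs = 500, timeoutMs = 600000 } = {}) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    const check = () => {
      if (fs.existsSync(filePath)) return resolve(filePath);
      if (Date.now() >= deadline) {
        return reject(new Error(`Timed out waiting for ${filePath}`));
      }
      setTimeout(check, intervalMs); // poll again after a short delay
    };
    check();
  });
}
```

An external process would call `signalReady("/tmp/storyforge-ready.signal")` after the human finishes logging in. Removing a stale signal file before each run avoids continuing too early.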