# Douyin Browser Capture

This tool drives a real Playwright Chromium session, lets a human log into Douyin, captures the loaded profile and work pages, and can sync the captured bundle into StoryForge's existing `/v2/douyin/accounts/sync` endpoint.

## Install

```bash
cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm install
npx playwright install chromium
```

## Run

```bash
cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm run capture -- \
  --profile-url https://www.douyin.com/user/your_account \
  --storyforge-username kris \
  --storyforge-password 'Asd123456.'
```

The browser keeps its state in a persistent directory under `~/.storyforge/douyin-playwright`, so the Douyin login can persist between runs.

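A minimal sketch of how such a persistent state directory can be wired into Playwright, assuming the tool relies on `chromium.launchPersistentContext` (this README does not confirm that). The helper names `resolveStateDir` and `openPersistentBrowser` are hypothetical; only the directory name comes from the text above.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: builds the state directory path named in this README.
function resolveStateDir(homeDir = os.homedir()) {
  return path.join(homeDir, ".storyforge", "douyin-playwright");
}

// Hypothetical helper: opens a headed Chromium window whose cookies and
// local storage are kept in the state directory between runs.
async function openPersistentBrowser(stateDir = resolveStateDir()) {
  // Loaded lazily so the path helper works without Playwright installed.
  const { chromium } = require("playwright");
  return chromium.launchPersistentContext(stateDir, { headless: false });
}
```

Because the context is tied to a directory on disk rather than to a single process, a later run that points at the same directory starts with the cookies left by the previous one.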
## Control Panel

If you do not want to remember CLI arguments, start the local control panel:

```bash
cd /Users/kris/code/StoryForge-gitea/scripts/douyin-browser-capture
npm run control-panel
```

Then open [http://127.0.0.1:3618](http://127.0.0.1:3618) and use this flow:

1. Fill in the Douyin profile URL and StoryForge credentials.
2. Click `开始采集` ("Start capture").
3. A real Chromium window opens. Log into Douyin and solve any captcha there.
4. Return to the control panel and click `已完成登录,继续采集` ("Login complete, continue capture").
5. Wait for `summary.json` and the optional StoryForge sync result.

The control panel stores each run under:

`/Users/kris/code/StoryForge-gitea/output/playwright/douyin/control-panel`

The StoryForge token field is session-scoped in the browser and is not written back into the saved form values, so it will not be refilled from `localStorage` on the next launch.

## What it captures

- JSON blobs extracted from `<script>` tags on the current profile page
- selected window globals such as `__INITIAL_STATE__`
- relevant JSON network responses
- creator-center pages, loaded in the same logged-in browser context
- a limited number of video detail pages linked from the profile

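As an illustration of the first item, here is a small sketch of pulling JSON blobs out of `<script>` tags in an HTML string. The function name `extractJsonScripts` is hypothetical, and the regex-based approach is an assumption chosen to keep the example dependency-free; the tool itself may do this inside the page instead.

```javascript
// Hypothetical helper: returns every <script> body in `html` that parses
// as a JSON object or array. Ordinary JavaScript is skipped.
function extractJsonScripts(html) {
  const blobs = [];
  const scriptPattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptPattern)) {
    const text = match[1].trim();
    // Only consider bodies that look like a JSON object or array.
    if (!/^[\[{]/.test(text)) continue;
    try {
      blobs.push(JSON.parse(text));
    } catch {
      // Starts like JSON but is not valid JSON, so ignore it.
    }
  }
  return blobs;
}
```

In a Playwright session the input string could come from `await page.content()`, which returns the full HTML of the loaded page.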
## Output

Default output directory:

`/Users/kris/code/StoryForge-gitea/output/playwright/douyin`

Each run writes:

- `profile-bundle.json`
- `creator-*.json`
- `video-*.json`
- `storyforge-sync-request.json`
- `storyforge-sync-response.json` when sync is enabled
- `summary.json`

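When inspecting a run directory by hand, it can help to group the files by the naming patterns listed above. The sketch below does that; the kind labels and the function names `classifyRunFile` and `groupRunFiles` are illustrative assumptions, and only the file name patterns come from this README.

```javascript
// File name patterns from the list above; the labels are illustrative.
const RUN_FILE_KINDS = [
  ["profile", /^profile-bundle\.json$/],
  ["creator", /^creator-.+\.json$/],
  ["video", /^video-.+\.json$/],
  ["sync-request", /^storyforge-sync-request\.json$/],
  ["sync-response", /^storyforge-sync-response\.json$/],
  ["summary", /^summary\.json$/],
];

// Returns the kind label for one file name, or "unknown" if nothing matches.
function classifyRunFile(fileName) {
  const entry = RUN_FILE_KINDS.find(([, pattern]) => pattern.test(fileName));
  return entry ? entry[0] : "unknown";
}

// Groups the file names of one run directory by kind.
function groupRunFiles(fileNames) {
  const groups = {};
  for (const name of fileNames) {
    const kind = classifyRunFile(name);
    if (!groups[kind]) groups[kind] = [];
    groups[kind].push(name);
  }
  return groups;
}
```

A typical call would be `groupRunFiles(require("node:fs").readdirSync(runDir))`, where `runDir` is the directory of a single run.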
## Notes

- This is designed as a browser-assisted capture flow, not a fully headless anti-bot bypass.
- If Douyin shows a slider or challenge page, solve it manually in the opened browser window and then continue.
- Use `--no-sync` if you only want to save a local bundle for inspection.
- Use `--ready-file /tmp/storyforge-ready.signal` if you want another process or webpage to decide when capture continues.
- Creator-center pages belong to the currently logged-in Douyin account. StoryForge now treats them as supplemental evidence by default and will not let them overwrite the target profile unless you explicitly pass `--allow-creator-center-fallback`.
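
To make the `--ready-file` idea concrete, here is a sketch of how a waiting side could poll for such a signal file. It assumes the signal is simply the file's existence, which this README does not state; the function name `waitForReadyFile` and its options are hypothetical.

```javascript
const fs = require("node:fs");

// Hypothetical helper: resolves once `readyFile` exists, checking every
// `intervalMs`. Rejects after `timeoutMs` so a forgotten signal cannot
// block the process forever.
function waitForReadyFile(readyFile, { intervalMs = 500, timeoutMs = 10 * 60 * 1000 } = {}) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    const check = () => {
      if (fs.existsSync(readyFile)) return resolve(readyFile);
      if (Date.now() >= deadline) {
        return reject(new Error(`Timed out waiting for ${readyFile}`));
      }
      setTimeout(check, intervalMs);
    };
    check();
  });
}
```

Under the same assumption, another process could release the wait by creating the file, for example with `touch /tmp/storyforge-ready.signal`.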