
Commit 77a01f8

fix CloudFlair challenge
1 parent df69a65 commit 77a01f8

15 files changed: +13322 -2000 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -40,3 +40,6 @@ dist/
4040
tmp/
4141
*.swp
4242
.DS_Store
43+
curl/
44+
.history/
45+
.vscode/

README.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -48,3 +48,6 @@ Great thanks to leetcode.com, a really awesome website!
4848
Coding it!
4949
Run test(s) and pray... $ leetcode test ./two-sum.cpp -t '[3,2,4]\n7'
5050
Submit final solution! $ leetcode submit ./two-sum.cpp
51+
52+
## Bypass Cloudflare
53+
See [Bypass Cloudflare](./cfx.md) for how we use a [curl-impersonate](https://github.com/lwthiker/curl-impersonate)-based solution to bypass Cloudflare.

cfx.md

Lines changed: 217 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,217 @@
1+
### About Cloudflare
2+
#### What is Cloudflare Bot Management?
3+
Cloudflare is a web performance and security company. On the security side, they offer customers a [Web Application Firewall (WAF)](https://www.cloudflare.com/waf/). A WAF can defend applications against several security threats, such as cross-site scripting (XSS), credential stuffing, and DDoS attacks.
4+
5+
One of the core systems included in their WAF is Cloudflare's Bot Manager. As a bot protection solution, its main goal is to mitigate attacks from malicious bots without impacting real users.
6+
7+
About one in five of the websites you may need to scrape sit behind Cloudflare, whose anti-bot protection readily blocks automated clients.
8+
9+
#### How to bypass it?
10+
According to [this question on Stack Overflow](https://stackoverflow.com/questions/71529199/where-does-cloudflare-detect-web-and-terminal-requests-on-equal-terms),
11+
Cloudflare uses various techniques to determine whether the user agent is a real browser. The site owner can also set the level of risk they are willing to allow via the Cloudflare platform.
12+
A few of the techniques known to be used by Cloudflare:
13+
* TLS fingerprinting: one of the most prominent techniques used by Cloudflare. This is also why tools like [naiveproxy](https://github.com/klzgrad/naiveproxy) are popular.
14+
* Cookies: Cloudflare has used some cf_-related cookies to distinguish real users from bots.
15+
16+
These are only a few of the techniques; Cloudflare has many more.
17+
18+
After many tests, the proxy-based solutions (naiveproxy, FlareSolverr) don't really work! The [curl-impersonate](https://github.com/lwthiker/curl-impersonate)-based solution works well. It involves these steps:
19+
* install curl-impersonate
20+
* compile node-libcurl with curl-impersonate
21+
* [test](curl/README.md)
22+
* rewrite leetcode-cli to replace request with modified node-libcurl
23+
* update the vscode leetcode plugin(extension) to use the enhanced leetcode-cli plugin
24+
25+
Finally, we use the curl_chrome116 command line + exec as the solution, because of the NODE_MODULE_VERSION incompatibility issue in VS Code. VS Code itself is built on Electron, but we build the modified version of node-libcurl with Node (18.12.0). Although the VS Code LeetCode extension spawns a separate Node process (18.12.0) to run the underlying leetcode commands, we still get the NODE_MODULE_VERSION error.
26+
27+
```
28+
const childProc = wsl.useWsl()
29+
? cp.spawn("wsl", [leetCodeExecutor_1.leetCodeExecutor.node, leetCodeBinaryPath, "user", commandArg], { shell: true })
30+
: cp.spawn(leetCodeExecutor_1.leetCodeExecutor.node, [leetCodeBinaryPath, "user", commandArg], {
31+
shell: true,
32+
env: cpUtils_1.createEnvOption(),
33+
});
34+
35+
this.executeCommandEx(this.nodeExecutable, [yield this.getLeetCodeBinaryPath(), "plugin", "-e", plugin]);
36+
```
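The "curl_chrome116 command line + exec" approach described above could be sketched roughly as follows. This is a minimal sketch, not the code from this commit: `buildCurlArgs` and `curlChrome` are hypothetical names, and it assumes the `curl_chrome116` wrapper script installed by curl-impersonate is on `PATH`.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: turn request-style options into curl arguments.
function buildCurlArgs(opts) {
  const args = ['-s', '-X', opts.method || 'GET'];
  for (const [name, value] of Object.entries(opts.headers || {})) {
    args.push('-H', `${name}: ${value}`);
  }
  if (opts.body !== undefined) args.push('--data-raw', String(opts.body));
  args.push(opts.url);
  return args;
}

// Hypothetical helper: run curl_chrome116 (assumed to be on PATH) and
// hand the raw response text to a Node-style callback.
function curlChrome(opts, cb) {
  execFile('curl_chrome116', buildCurlArgs(opts), { maxBuffer: 16 * 1024 * 1024 },
    (err, stdout) => cb(err, stdout));
}
```

Because the HTTP call happens in a separate process, no native addon is loaded into the Electron or Node runtime, so the NODE_MODULE_VERSION check never comes into play.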
37+
38+
### Install curl-impersonate
39+
Refer to [INSTALL.md](https://github.com/lwthiker/curl-impersonate/blob/main/INSTALL.md#macos)
40+
+ install the prebuilt binary through brew
41+
```
42+
brew tap shakacode/brew
43+
brew install curl-impersonate
44+
```
45+
+ or compile & install from source code
46+
```
47+
# Install dependencies for building all the components:
48+
brew install pkg-config make cmake ninja autoconf automake libtool
49+
# For the Firefox version only
50+
brew install sqlite nss
51+
pip3 install gyp-next
52+
# For the Chrome version only
53+
brew install go
54+
55+
# Clone the repository
56+
git clone https://github.com/lwthiker/curl-impersonate.git
57+
cd curl-impersonate
58+
59+
# Configure and compile
60+
mkdir build && cd build
61+
../configure
62+
# Build and install the Firefox version
63+
gmake firefox-build
64+
sudo gmake firefox-install
65+
# Build and install the Chrome version
66+
gmake chrome-build
67+
sudo gmake chrome-install
68+
# Optionally remove all the build files
69+
cd ../ && rm -Rf build
70+
```
71+
72+
### Compile node-libcurl with curl-impersonate
73+
Build node-libcurl from source on macOS
74+
```
75+
76+
# install the build tool node-gyp
77+
npm i -g node-pre-gyp node-gyp
78+
# build & install node-libcurl from source, first time (to generate build files)
79+
# npm_config_build_from_source=true npm i node-libcurl
80+
# use yarn as npm doesn't create build folders and make files!
81+
npm_config_build_from_source=true yarn add node-libcurl
82+
npm_config_build_from_source=true npm_config_curl_static_build=false yarn add node-libcurl
83+
# static build runs successfully, but got missing symbol(dyld) at runtime (TODO)
84+
npm_config_build_from_source=true npm_config_curl_static_build=true yarn add node-libcurl
85+
86+
# got below error:
87+
# npm ERR! clang: error: no such file or directory: '/usr/include'
88+
# modify below make file and remove all /usr/include, save it
89+
vi ./node_modules/node-libcurl/build/node_libcurl.target.mk
90+
91+
# for static build, got below errors:
92+
# clang: error: no such file or directory: '/usr/lib/libcurl.@libext@'
93+
# clang: error: no such file or directory: '@LDFLAGS@'
94+
# clang: error: no such file or directory: '@LIBCURL_LIBS@'
95+
96+
# or we can do following tricks to "modify" curl-config
97+
# because build/config.gypi & build/node_libcurl.target.mk are generated based on curl-config
98+
cp /usr/bin/curl-config /usr/local/bin/curl-config
99+
vi /usr/local/bin/curl-config
100+
# modify & save, make sure /usr/local/bin is before /usr/bin in PATH env var
101+
# reload the shell: source ~/.zshrc
102+
103+
# then build it again with node-gyp
104+
cd ./node_modules/node-libcurl
105+
node-gyp build
106+
107+
# verify the lib/binding/node_libcurl.node file
108+
otool -L lib/binding/node_libcurl.node
109+
110+
lib/binding/node_libcurl.node:
111+
/usr/lib/libcurl.4.dylib (compatibility version 7.0.0, current version 9.0.0)
112+
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 1600.157.0)
113+
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.61.1)
114+
```
115+
![node_libcurl.node dylib dependencies](docs/otool2.png)
116+
117+
```
118+
# also noticed the reported version is changed
119+
# console.log(Curl.getVersionInfo());
120+
node curl/lc.js
121+
```
122+
![Curl.getVersionInfo()](docs/test2a.png)
123+
124+
```
125+
# run test, it works!
126+
node curl/lc.js
127+
128+
### questions ???
129+
# otool -L node_modules/node-libcurl/lib/binding/node_libcurl.node
130+
to find out which libcurl.4.dylib is loaded (as /usr/lib/libcurl.4.dylib doesn't exist!) ?
131+
otool
132+
lib/binding/node_libcurl.node:
133+
/usr/lib/libcurl.4.dylib (compatibility version 7.0.0, current version 9.0.0)
134+
135+
# DYLD_PRINT_LIBRARIES=1 node curl/lc.js
136+
... ...
137+
dyld[69574]: <4528259C-8493-3A0C-8B35-F29E87F59EED> /Users/harry/test/leetcode-cli/node_modules/node-libcurl/lib/binding/node_libcurl.node
138+
dyld[69574]: <90815EBD-89C8-33E7-8B86-5A024176BC15> /usr/lib/libcurl.4.dylib
139+
... ...
140+
looks like macOS serves /usr/lib/libcurl.4.dylib from the dyld shared cache (since macOS 11, system libraries are no longer present as files on disk)
141+
142+
# references
143+
# https://github.com/lwthiker/curl-impersonate
144+
# https://github.com/lwthiker/curl-impersonate#libcurl-impersonate
145+
# https://github.com/lwthiker/curl-impersonate/issues/80#issuecomment-1166192854
146+
# https://github.com/JCMais/node-libcurl?tab=readme-ov-file#building-on-macos
147+
```
148+
149+
### Test Conclusion
150+
| Not working | Working |
151+
| ----------- | ----------- |
152+
| original node-libcurl | curl-impersonate |
153+
| naive proxy | node exec + curl-impersonate |
154+
| | modified node-libcurl |
155+
156+
### TODO
157+
- Fix the NODE_MODULE_VERSION error by building node-libcurl [with electron](https://github.com/JCMais/node-libcurl?tab=readme-ov-file#electron-aka-atom-shell)
158+
```
159+
Failed to list problems: Error: The module '/Users/harry/.vscode/extensions/leetcode.vscode-leetcode-0.18.1/node_modules/vsc-leetcode-cli/node_modules/node-libcurl/lib/binding/node_libcurl.node' was compiled against a different Node.js version using NODE_MODULE_VERSION 108. This version of Node.js requires NODE_MODULE_VERSION 118. Please try re-compiling or re-installing the module (for instance, using `npm rebuild` or `npm install`)..
160+
```
161+
- Try different install locations for the node-libcurl
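To see which runtime is actually loading the addon when the NODE_MODULE_VERSION error above appears, it may help to print the ABI version from inside the failing process. The sketch below is only a diagnostic aid; `abiReport` and `abiMatches` are hypothetical helper names, while `process.versions.modules` is the standard field holding the runtime's ABI version.

```javascript
// process.versions.modules is the ABI version (NODE_MODULE_VERSION)
// of whatever runtime is executing this script.
function abiReport(versions = process.versions) {
  return {
    abi: versions.modules,
    node: versions.node,
    electron: versions.electron || null, // only set when running inside Electron
  };
}

// Hypothetical helper: does an addon compiled for `builtAbi` match this runtime?
function abiMatches(builtAbi, versions = process.versions) {
  return String(builtAbi) === String(versions.modules);
}

console.log(abiReport());
```

If the report shows an `electron` version or an ABI other than 108, the extension is not running the addon under the Node 18.12.0 binary it was compiled with.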
162+
163+
### Update the vscode leetcode extension
164+
165+
### References
166+
- [Could not login with both 'leetcode user -l' and 'leetcode user -c'](https://github.com/skygragon/leetcode-cli/issues/218)
167+
- [Cannot login with premium account](https://github.com/skygragon/leetcode-cli/issues/194)
168+
- [Failed to log in with a leetcode.com account](https://github.com/LeetCode-OpenSource/vscode-leetcode/issues/478), [a comment](https://github.com/LeetCode-OpenSource/vscode-leetcode/issues/478#issuecomment-564757098)
169+
- Proxy Server to bypass Cloudflare: [FlareSolverr](https://github.com/FlareSolverr/FlareSolverr), [naiveproxy](https://github.com/klzgrad/naiveproxy)
170+
- [How To Bypass Cloudflare in 2024](https://scrapeops.io/web-scraping-playbook/how-to-bypass-cloudflare/)
171+
- [How to Bypass Cloudflare in 2024: The 8 Best Methods](https://www.zenrows.com/blog/bypass-cloudflare)
172+
- [How to bypass Cloudflare when web scraping in 2024](https://scrapfly.io/blog/how-to-bypass-cloudflare-anti-scraping/)
173+
- [node abi versions](https://github.com/nodejs/node/blob/main/doc/abi_version_registry.json)
174+
175+
### Archived notes
176+
```
177+
# node-gyp related files:
178+
~/Library/Caches/node-gyp/20.11.1/include/node/common.gypi
179+
~/Library/Caches/node-gyp/20.11.1/include/node/config.gypi
180+
./node_modules/node-gyp/addon.gypi
181+
./node_modules/node-libcurl/build/config.gypi
182+
./node_modules/node-libcurl/build/node_libcurl.target.mk
183+
184+
# other stuff (useless)
185+
# no build running
186+
npm install node-libcurl --verbose --build-from-source --curl_static_build=false --update-binary
187+
188+
# rebuild the node_libcurl.node binding
189+
npm rebuild node-libcurl --update-binary
190+
191+
export @LDFLAGS@="-L/usr/local/lib -L$(xcrun --show-sdk-path)/usr/lib -L/usr/lib"
192+
export @LIBCURL_LIBS@="-L/usr/local/opt/curl/lib"
193+
194+
export CFLAGS="-I/usr/local/include"
195+
export CXXFLAGS="-I/usr/local/include"
196+
export CPPFLAGS="-I/usr/local/include"
197+
export LDFLAGS="-L/usr/local/lib -L/usr/local/Cellar/curl/0.6.0-alpha.1/lib"
198+
export LIBRARY_PATH="/usr/local/lib -L/usr/local/Cellar/curl/0.6.0-alpha.1/lib"
199+
$(xcrun --show-sdk-path)/usr/include
200+
# Set environment variables for include and lib directories
201+
export CURL_INCLUDE_DIR=/usr/local/Cellar/curl/8.6.0/include/curl
202+
export CURL_LIB_DIR=/usr/local/Cellar/curl-impersonate/0.6.0-alpha.1/lib
203+
204+
# use the default macOS clang compiler! do NOT use other compilers
205+
CC=gcc-13 CXX=g++-13 npm_config_build_from_source=true yarn add node-libcurl
206+
CC=llvm-gcc CXX=llvm-g++ npm_config_build_from_source=true yarn add node-libcurl
207+
208+
npm install node-libcurl --build-from-source --curl_libraries='-Wl,-rpath /usr/local/lib -lcurl'
209+
npm install node-libcurl --build-from-source --curl_libraries='-Wl,-rpath /usr/local/lib -lcurl-impersonate-chrome'
210+
211+
leetcode-cli locations:
212+
- ~/.nvm/versions/node/v18.12.0/lib/node_modules/leetcode-cli-plugins
213+
- ~/.nvm/versions/node/v18.12.0/lib/node_modules/vsc-leetcode-cli
214+
- ~/.nvm/versions/node/v18.12.0/bin/leetcode
215+
- ~/.vscode/extensions/leetcode.vscode-leetcode-0.18.1/
216+
- ~/.vscode/extensions/leetcode.vscode-leetcode-0.18.1/node_modules/vsc-leetcode-cli/
217+
```

docs/otool2.png

234 KB

docs/test2a.png

380 KB

lc

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
#!/usr/bin/env node
2+
3+
require('./lib/cli').run();

lib/commands/user.js

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -112,7 +112,7 @@ cmd.handler = function(argv) {
112112
prompt.message = '';
113113
prompt.start();
114114
prompt.get([
115-
{name: 'login', required: true},
115+
{name: 'login', required: true, default: 'harrysun2006@gmail.com'},
116116
{name: 'cookie', required: true}
117117
], function(e, user) {
118118
if (e) return log.fail(e);

lib/log.js

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,7 @@ log.init = function() {
5252
const args = Array.from(arguments);
5353
if (name !== 'INFO') args.unshift('[' + name + ']');
5454

55-
let s = args.map(x => x.toString()).join(' ');
55+
let s = args.map(x => x ? x.toString() : '').join(' ');
5656
if (level.color) s = chalk[level.color](s);
5757

5858
this.output(s);
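One thing to note about the patched line: `x ? x.toString() : ''` blanks every falsy value, so `0` and `false` also become empty strings. If that is not intended, a narrower guard is possible. The helper below is a hypothetical sketch, not part of this commit.

```javascript
// Hypothetical variant of the patched line: only null and undefined
// are replaced with '', so 0 and false are still logged.
function joinArgs(args) {
  return args
    .map(x => (x === null || x === undefined) ? '' : String(x))
    .join(' ');
}
```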

lib/plugins/leetcode.js

Lines changed: 10 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -2,8 +2,8 @@
22
var util = require('util');
33

44
var _ = require('underscore');
5-
var request = require('request');
65
var prompt = require('prompt');
6+
const { request } = require('../util');
77

88
var config = require('../config');
99
var h = require('../helper');
@@ -27,7 +27,7 @@ plugin.signOpts = function(opts, user) {
2727
};
2828

2929
plugin.makeOpts = function(url) {
30-
const opts = {};
30+
const opts = {method: 'GET'};
3131
opts.url = url;
3232
opts.headers = {};
3333

@@ -86,16 +86,14 @@ plugin.getCategoryProblems = function(category, cb) {
8686
e = plugin.checkError(e, resp, 200);
8787
if (e) return cb(e);
8888

89-
const json = JSON.parse(body);
90-
9189
// leetcode permits anonymous access to the problem list
9290
// while we require login first to make a better experience.
93-
if (json.user_name.length === 0) {
91+
if (body.user_name.length === 0) {
9492
log.debug('no user info in list response, maybe session expired...');
9593
return cb(session.errors.EXPIRED);
9694
}
9795

98-
const problems = json.stat_status_pairs
96+
const problems = body.stat_status_pairs
9997
.filter((p) => !p.stat.question__hide)
10098
.map(function(p) {
10199
return {
@@ -109,7 +107,7 @@ plugin.getCategoryProblems = function(category, cb) {
109107
percent: p.stat.total_acs * 100 / p.stat.total_submitted,
110108
level: h.levelToName(p.difficulty.level),
111109
starred: p.is_favor,
112-
category: json.category_slug
110+
category: body.category_slug
113111
};
114112
});
115113

@@ -143,7 +141,7 @@ plugin.getProblem = function(problem, needTranslation, cb) {
143141
' }',
144142
'}'
145143
].join('\n'),
146-
variables: {titleSlug: problem.slug},
144+
variables: {titleSlug: problem.slug},
147145
operationName: 'getQuestionDetail'
148146
};
149147

@@ -228,7 +226,7 @@ function verifyResult(task, queue, cb) {
228226
e = plugin.checkError(e, resp, 200);
229227
if (e) return cb(e);
230228

231-
let result = JSON.parse(body);
229+
let result = body;
232230
if (result.state === 'SUCCESS') {
233231
result = formatResult(result);
234232
_.extendOwn(result, task);
@@ -392,8 +390,7 @@ plugin.getFavorites = function(cb) {
392390
e = plugin.checkError(e, resp, 200);
393391
if (e) return cb(e);
394392

395-
const favorites = JSON.parse(body);
396-
return cb(null, favorites);
393+
return cb(null, body);
397394
});
398395
};
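The hunks above drop the `JSON.parse(body)` calls, which suggests that the `request` replacement from `../util` hands callbacks an already-parsed body. That module is not shown in this view, so the following is only a sketch of how such parsing might look; `parseBody` is a hypothetical name.

```javascript
// Hypothetical helper: parse the raw response text as JSON when it looks
// like JSON, otherwise return the text unchanged.
function parseBody(raw, contentType) {
  const looksJson = /json/i.test(contentType || '') || /^\s*[\[{]/.test(raw);
  if (!looksJson) return raw;
  try {
    return JSON.parse(raw);
  } catch (e) {
    // Leave non-JSON payloads (for example an HTML challenge page) untouched.
    return raw;
  }
}
```

Falling back to the raw text keeps callers from crashing when the server answers with HTML instead of JSON, though they still need to check the shape of `body` before reading fields such as `user_name`.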
399396

@@ -533,7 +530,7 @@ plugin.login = function(user, cb) {
533530
});
534531
};
535532

536-
function parseCookie(cookie, body, cb) {
533+
function parseCookie(cookie, cb) {
537534
const SessionPattern = /LEETCODE_SESSION=(.+?)(;|$)/;
538535
const csrfPattern = /csrftoken=(.+?)(;|$)/;
539536
const reCsrfResult = csrfPattern.exec(cookie);
@@ -553,7 +550,7 @@ function requestLeetcodeAndSave(request, leetcodeUrl, user, cb) {
553550
if (redirectUri !== config.sys.urls.leetcode_redirect) {
554551
return cb('Login failed. Please make sure the credential is correct.');
555552
}
556-
const cookieData = parseCookie(resp.request.headers.cookie, body, cb);
553+
const cookieData = parseCookie(resp.request.headers.cookie, cb);
557554
user.sessionId = cookieData.sessionId;
558555
user.sessionCSRF = cookieData.sessionCSRF;
559556
session.saveUser(user);
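For reference, the two cookie patterns used by `parseCookie` behave as in the sketch below. The regular expressions are copied from the diff; the surrounding function (`extractSession`) and its return-`null` convention are illustrative assumptions, since the full body of `parseCookie` is not shown here.

```javascript
// The two patterns come from the diff; the function around them is a sketch.
function extractSession(cookie) {
  const session = /LEETCODE_SESSION=(.+?)(;|$)/.exec(cookie);
  const csrf = /csrftoken=(.+?)(;|$)/.exec(cookie);
  if (!session || !csrf) return null; // either value missing: treat as invalid
  return { sessionId: session[1], sessionCSRF: csrf[1] };
}
```

Each lazy group `(.+?)` stops at the first `;` or at the end of the string, so the order of the two cookies in the header does not matter.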

lib/plugins/solution.discuss.js

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
var request = require('request');
1+
const { request } = require('../util');
22

33
var log = require('../log');
44
var chalk = require('../chalk');
@@ -60,7 +60,7 @@ function getSolution(problem, lang, cb) {
6060
})
6161
}
6262
};
63-
request(opts, function(e, resp, body) {
63+
request.post(opts, function(e, resp, body) {
6464
if (e) return cb(e);
6565
if (resp.statusCode !== 200)
6666
return cb({msg: 'http error', statusCode: resp.statusCode});
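The switch from `request(opts, ...)` to `request.post(opts, ...)`, together with the `{method: 'GET'}` default added in `makeOpts`, implies that the replacement `request` is a callable with method helpers. A minimal sketch of that shape is given below, assuming a pluggable `transport(opts, cb)` function; `makeRequest` and `transport` are hypothetical names, not the actual `../util` implementation.

```javascript
// Hypothetical factory: `transport(opts, cb)` performs the actual HTTP call
// and invokes cb(err, resp, body) in the style of the old `request` module.
function makeRequest(transport) {
  const request = (opts, cb) =>
    transport(Object.assign({ method: 'GET' }, opts), cb);
  request.get = (opts, cb) =>
    request(Object.assign({}, opts, { method: 'GET' }), cb);
  request.post = (opts, cb) =>
    request(Object.assign({}, opts, { method: 'POST' }), cb);
  return request;
}
```

Keeping the `(e, resp, body)` callback signature means call sites such as `getSolution` only need the one-word change shown in the hunk above.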
