Commit 210bcd5

Updated WebPaser Tutorial
1 parent a9e9c49 commit 210bcd5

5 files changed

+93
-80
lines changed


source/tips/webparser-tutorial.html

Lines changed: 93 additions & 80 deletions
@@ -81,7 +81,7 @@ <h3>A quick example:</h3>
8181

8282
<h2>Our tutorial skin</h2>
8383

84-
<p>So what we are going to do today is parse a website <a href="https://www.ipaddress.my/" target="blank">https://www.ipaddress.my/</a> to get our IP and location information to use in a skin. Click that link now if you want to open the page in a new tab/window.</p>
84+
<p>So what we are going to do today is parse a website <a href="https://browserleaks.com/ip" target="blank">https://browserleaks.com/ip</a> to get our IP and location information to use in a skin. Click that link now if you want to open the page in a new tab/window.</p>
8585

8686
<p>Here is what the web page we are going to parse looks like, with the information we are going to extract numbered:</p>
8787
<div class="exampleprev">
@@ -107,7 +107,7 @@ <h2>The parent Webparser measure</h2>
107107

108108
[MeasureSite]
109109
Measure=WebParser
110-
URL=https://www.ipaddress.my/
110+
URL=https://browserleaks.com/ip
111111
RegExp=
112112
Debug=2
113113

@@ -122,73 +122,65 @@ <h2>The parent Webparser measure</h2>
122122

123123
<p>Now let's look at the first bit of information we want to retrieve.</p>
124124

125-
<p>First we will get the flag image that is displayed centered near the top of the page.</p>
125+
<p>First we will get the IP address that the site detects for our PC.</p>
126126

127-
<p>Open up WebParserDump.txt (the saved output from the website) and search for the location where the site is storing these flag images:</p>
127+
<p>Open up WebParserDump.txt (the saved output from the website) and search for the location where the site is displaying the IP Address data:</p>
128128

129129
``` html
130-
<img src="/images/flags/us.png" border="0" height="48" width="48" alt="United States of America flag" align="absMiddle" /> </li>
130+
<tr><td>IP Address</td><td><span class="flag-container" id="client-ipv4" data-ip="68.100.86.32"
131131
```
132132

133-
<p>If we search for <code>/images/flags/</code>, it will take us to the right place. What we want is the name of the file that follows <code>/images/flags/</code>. In this example, <code>us.png</code>.</p>
133+
<p>If we search for <code>data-ip=</code>, it will take us to the right place. What we want is the IP address that follows, enclosed in quotes. In this example, <code>68.100.86.32</code>.</p>
134134

135135
<p>Let's start building our <code>RegExp</code> option:</p>
136136

137137
``` html
138138
[MeasureSite]
139139
Measure=WebParser
140-
URL=https://www.ipaddress.my/
141-
RegExp=(?siU)<img src="/images/flags/(.*)"
140+
URL=https://browserleaks.com/ip
141+
RegExp=(?siU)data-ip="(.*)"
142142
UpdateRate=3600
143143
```
144144

145145
<p>The important options are:</p>
146146

147-
<p><b>URL=https://www.ipaddress.my/</b> - The URL to the website. It can be set as a variable in the [Variables] section to make it easier to find and change if you want.</p>
147+
<p><b>URL=https://browserleaks.com/ip</b> - The URL to the website. It can be set as a variable in the [Variables] section to make it easier to find and change if you want.</p>
148148

149-
<p><b>RegExp=(?siU)&ltimg src="/images/flags/(.*)"</b> - Ah, the meat and potatoes...</p>
149+
<p><b>RegExp=(?siU)data-ip="(.*)"</b> - Ah, the meat and potatoes...</p>
150150

151151
<p>You are telling RegExp to:</p>
152152

153-
<p>Use the (?siU) expression directives, (described earlier) search for <code>&ltimg src="/images/flags/</code> and capture everything to a StringIndex <code>(.*)</code> until it sees <code>"</code>, where it will stop.</p>
153+
<p>Use the (?siU) expression directives (described earlier), search for <code>data-ip="</code>, and capture everything into a StringIndex <code>(.*)</code> until it sees <code>"</code>, where it will stop.</p>
154154

155155
<p>So if we look again at our output in WebParserDump.txt </p>
156156

157157
``` html
158-
<img src="/images/flags/us.png" border="0" height="48" width="48" alt="United States of America flag" align="absMiddle" /> </li>
158+
<tr><td>IP Address</td><td><span class="flag-container" id="client-ipv4" data-ip="68.100.86.32"
159159
```
160160

161-
<p>You can see that we will return <code>us.png</code> in StringIndex 1</p>
161+
<p>You can see that we will return <code>68.100.86.32</code> in StringIndex 1</p>
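<p>As a quick check outside Rainmeter, the same capture can be sketched in Python. This is only an illustration, not part of the skin: Python's <code>re</code> module has no <code>(?U)</code> flag, so the lazy quantifier <code>.*?</code> is assumed here as a stand-in for the ungreedy behavior, and the variable names are made up for the example.</p>

```python
import re

# One line of WebParserDump.txt, copied from the tutorial's example.
dump = ('<tr><td>IP Address</td><td><span class="flag-container" '
        'id="client-ipv4" data-ip="68.100.86.32"')

# (?si) = dot matches newlines + case-insensitive; .*? replaces the U flag.
pattern = re.compile(r'(?si)data-ip="(.*?)"')

match = pattern.search(dump)
# Group 1 plays the role of StringIndex 1 on the parent measure.
string_index_1 = match.group(1) if match else ""
print(string_index_1)
```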
162162

163163
<p><b>UpdateRate=3600</b> - We want to check the website at a rate 3600 times the value in the "Update=" parameter in the "Rainmeter" section. As this defaults to "1000" or once every 1000 milliseconds (1 second) we will be running WebParser every 3600 seconds or 60 minutes. This is plenty often, as your IP information doesn't change much and you don't want to "spam" the website with requests. You may well find yourself blocked...</p>
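<p>The arithmetic behind that interval, written out as a small Python sketch. The variable names are illustrative only; the values are the default <code>Update</code> and the <code>UpdateRate</code> used in this tutorial.</p>

```python
update_ms = 1000     # Update= in [Rainmeter], in milliseconds (the default)
update_rate = 3600   # UpdateRate= on the WebParser parent measure

# The parent measure fetches the page once every Update * UpdateRate milliseconds.
interval_seconds = update_ms * update_rate // 1000
interval_minutes = interval_seconds // 60
print(interval_seconds, interval_minutes)
```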
164164

165165
<h2>The first child measure</h2>
166166

167167
<p>Then, we build a "child" measure, to grab that information from StringIndex 1 of the "parent" measure.</p>
168168

169-
<p>What we really want is not the file name of the flag image, but the image itself. WebParser can easily do this.</p>
170-
171-
<p>As long as the value from the parent measure that is returned in the StringIndex number is the URL to an image file, you can simply add the <code>Download=1</code> option to the child measure, and the image will be downloaded. The value of the child measure will then be the full <em>local</em> path to the file in the Windows TEMP folder Rainmeter will create.</p>
172-
173-
<p>However, wait.. the value "us.png" that is returned is not a full URL to the image. There is also a "relative path" to the image on the remote server. That's ok, we can still get the image.</p>
174-
175-
<h3>Creating the child measure for the the first StringIndex, the flag image value</h3>
169+
<h3>Creating the child measure for the first StringIndex, the IP address value</h3>
176170

177171
``` ini
178-
[MeasureFlagImage]
179-
Measure=WebParser
180-
URL=https://www.ipaddress.my/images/flags/[MeasureSite]
172+
[MeasureIP]
173+
Measure=Plugin
174+
Plugin=WebParser
175+
URL=[MeasureSite]
181176
StringIndex=1
182-
Download=1
183177
```
184178

185-
<p>What we are doing is appending the first part of the URL, the one we used on the parent measure to the beginning of the <code>URL</code> option, followed by the relative path that flag images are stored in on the site, which is <code>/images/flags/</code> and a reference to [MeasureSite] and the <code>StringIndex=2</code> option. Then we add <code>Download=1</code> and that full URL of <code>https://www.ipaddress.my/images/flags/us.png</code> will be used to retrieve the image file.</p>
186-
187179
<p>The value of the child measure will in my case be:</p>
188180

189-
<p><code>C:\Users\Jeffrey\AppData\Local\Temp\Rainmeter-Cache\us.png</code></p>
181+
<p><code>68.100.86.32</code></p>
190182

191-
<p>Which we can use in an Image meter later to display it.</p>
183+
<p>Which we can use in a String meter later to display it.</p>
192184

193185
<h3>Testing as we go</h3>
194186

@@ -200,75 +192,72 @@ <h3>Testing as we go</h3>
200192

201193
<h2>The second child measure</h2>
202194

203-
<p>Now let's get the next bit of information we want from the website. (Remember, the RegExp reads the website in order from top to bottom, so you need to use the correct order in the "RegExp=" statement. You can display the information in any order you want on your skin however.)</p>
195+
<p>Now let's get the next bit of information we want from the website. Remember, the RegExp reads the website in order from top to bottom, so you need to use the correct order in the "RegExp=" statement. You can display the information in any order you want on your skin however.</p>
204196

205-
<p>The next information in the WebParserDump file that we want is the IP address for your computer that the site detects:</p>
197+
<p>The next information in the WebParserDump file that we want is the flag image for your detected country:</p>
206198

207199
``` html
208-
<tr>
209-
<td width="40%">IP Address:</td>
210-
<td width="60%"><a href="https://www.ip2location.com/demo/68.100.86.32" target="_blank">68.100.86.32</a></td>
211-
</tr>
200+
<tr><td>Country</td><td><span class="flag-container" id="lookup-flag" data-iso_code="US"><img class="flag-icon" src="/img/flags/US.png"
212201
```
202+
<p>What we really want is not the file name of the flag image, but the image itself. WebParser can easily do this.</p>
213203

214-
<p>So we want to add to our "RegExp=" statement, search for th IP address, and return the result in the next StringIndex on the parent measure:</p>
204+
<p>As long as the value from the parent measure that is returned in the StringIndex number is the URL to an image file, you can simply add the Download=1 option to the child measure, and the image will be downloaded. The value of the child measure will then be the full local path to the file in the Windows TEMP folder Rainmeter will create.</p>
215205

216-
``` ini
206+
<p>We will modify our RegExp statement to capture the flag image path into a second StringIndex:</p>
207+
208+
```ini
217209
[MeasureSite]
218-
Measure=WebParser
219-
URL=https://www.ipaddress.my/
220-
RegExp=(?siU)<img src="/images/flags/(.*)".*<td width="40%">IP Address:</td>.*target="_blank">(.*)<
210+
Measure=Plugin
211+
Plugin=WebParser
212+
URL=https://browserleaks.com/ip
213+
RegExp=(?siU)data-ip="(.*)".*<tr><td>Country</td>.*src="(.*)"
221214
UpdateRate=3600
222215
```
223216

224-
<p>So after the first pair of start/stop searches we already did, we are adding:</p>
225-
226-
<p><code>.*&lttd width="40%"&gtIP Address:&lt/td&gt.*target="_blank"&gt(.*)&lt</code></p>
217+
<p>However, wait... the value <b>/img/flags/US.png</b> that is returned is not a full URL to the image. It is only a "relative path" to the image on the remote server. That's ok, we can still get the image.</p>
227218

228-
<p>This will tell RegExp to skip everything until it finds <code>&lttd width="40%"&gtIP Address:&lt/td&gt.*target="_blank"&gt</code> and then capture everything until it sees <code>&lt</code> and put it in <code>StringIndex=2</code>. The result in my example will be <code>68.100.86.32</code>.</p>
229-
230-
<h3>Creating the child measure for the the second StringIndex, the IP address value</h3>
219+
<h3>Creating the child measure for the second StringIndex, the flag image value</h3>
231220

232221
``` ini
233-
[MeasureIP]
234-
Measure=WebParser
235-
URL=[MeasureSite]
236-
StringIndex=2
222+
[MeasureFlagImage]
223+
Measure=Plugin
224+
Plugin=WebParser
225+
URL=https://browserleaks.com[MeasureSite]
226+
StringIndex=2
227+
Download=1
237228
```
238229

230+
<p>What we are doing is putting the root of the site, <code>https://browserleaks.com</code>, at the beginning of the <code>URL</code> option, followed by a reference to [MeasureSite] with the <code>StringIndex=2</code> option, which holds the relative path <code>/img/flags/US.png</code>. Then we add <code>Download=1</code> and the full URL of <code>https://browserleaks.com/img/flags/US.png</code> will be used to retrieve the image file.</p>
231+
232+
<p>The value of the child measure will in my case be:</p>
233+
234+
<p><code>C:\Users\Jeffrey\AppData\Local\Temp\Rainmeter-Cache\us.png</code></p>
235+
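<p>To make the URL assembly concrete, here is a minimal Python sketch of what <code>URL=https://browserleaks.com[MeasureSite]</code> amounts to once StringIndex 2 has been substituted. The names <code>site_root</code> and <code>string_index_2</code> are made up for the illustration; Rainmeter does this substitution itself.</p>

```python
from urllib.parse import urljoin

site_root = "https://browserleaks.com"
string_index_2 = "/img/flags/US.png"   # relative path captured by the parent measure

# Plain concatenation: the site root followed by the captured relative path.
download_url = site_root + string_index_2
print(download_url)

# Resolving the same root-relative path against the page URL gives the same result.
resolved = urljoin("https://browserleaks.com/ip", string_index_2)
print(resolved == download_url)
```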
239236
<h2>The third child measure</h2>
240237

241-
<p>Now we want to get the name of the "City":</p>
238+
<p>Now we want to get the name of the "Country":</p>
242239

243240
``` html
244-
<tr>
245-
<td>City:</td>
246-
<td>Fairfax</td>
247-
</tr>
241+
title="United States (US)"><span class="flag-text wball">United States <
248242
```
249243

250-
<p>So we want to add to our "RegExp=" statement, search for the city name, and return the result in the next StringIndex on the parent measure:</p>
244+
<p>So we want to add to our "RegExp=" statement, search for the Country name, and return the result in the next StringIndex on the parent measure:</p>
251245

252246
``` ini
253247
[MeasureSite]
254248
Measure=WebParser
255-
URL=https://www.geoiptool.com/
256-
RegExp=(?siU)<img src="/images/flags/(.*)".*<td width="40%">IP Address:</td>.*target="_blank">(.*)<.*<td>City:</td>.*<td>(.*)</td>
249+
URL=https://browserleaks.com/ip
250+
RegExp=(?siU)data-ip="(.*)".*<tr><td>Country</td>.*src="(.*)".*<span class="flag-text wball">(.*)<
257251
```
258252

259-
<p>So after the start/stop searches we already had on the RegExp option, we are adding:</p>
260-
261-
<p><code>.*&lttd&gtCity:&lt/td&gt.*&lttd&gt(.*)&lt/td&gt</code></p>
262-
263-
<p>This will tell RegExp to skip everything until it sees <code>&lttd&gtCity:&lt/td&gt.*&lttd&gt</code> then capture everything until it sees <code>&lt/td&gt</code> and put it in <code>StringIndex=3</code>. The result in my example will be <code>Fairfax</code>.</p>
264-
265253
<h3>Creating the child measure for the third StringIndex, the country name value</h3>
266254

267255
``` ini
268-
[MeasureCity]
269-
Measure=WebParser
270-
URL=[MeasureSite]
271-
StringIndex=3
256+
[MeasureCountryName]
257+
Measure=Plugin
258+
Plugin=WebParser
259+
URL=[MeasureSite]
260+
StringIndex=3
272261
```
273262

274263
<h3>And so on...</h3>
@@ -277,33 +266,57 @@ <h3>And so on...</h3>
277266

278267
``` ini
279268
[MeasureSite]
280-
Measure=WebParser
281-
URL=https://www.ipaddress.my/
282-
RegExp=(?siU)<img src="/images/flags/(.*)".*<td width="40%">IP Address:</td>.*target="_blank">(.*)<.*<td>City:</td>.*<td>(.*)</td>.*<td>Country:</td>.*target="_blank">(.*)<.*<td>State:</td>.*<td>(.*)</td>.*<td>Latitude:</td>.*<td>(.*)</td>.*<td>Longitude:</td>.*<td>(.*)</td>
269+
Measure=Plugin
270+
Plugin=WebParser
271+
URL=https://browserleaks.com/ip
272+
RegExp=(?siU)data-ip="(.*)".*<tr><td>Country</td>.*src="(.*)".*<span class="flag-text wball">(.*)<.*<tr><td>State/Region</td><td>(.*)<.*<tr><td>City</td><td>(.*)<.*<span id="coords-click">Coordinates</span>.*data-lat="(.*)".*data-lon="(.*)"
283273
UpdateRate=3600
284274
```
285275

286-
<p>and the rest of the child measures:</p>
276+
<p>and all of the child measures:</p>
287277

288278

289279
``` ini
280+
[MeasureIP]
281+
Measure=Plugin
282+
Plugin=WebParser
283+
URL=[MeasureSite]
284+
StringIndex=1
285+
286+
[MeasureFlagImage]
287+
Measure=Plugin
288+
Plugin=WebParser
289+
URL=https://browserleaks.com[MeasureSite]
290+
StringIndex=2
291+
Download=1
292+
290293
[MeasureCountryName]
291-
Measure=WebParser
294+
Measure=Plugin
295+
Plugin=WebParser
292296
URL=[MeasureSite]
293-
StringIndex=4
297+
StringIndex=3
294298

295299
[MeasureRegion]
296-
Measure=WebParser
300+
Measure=Plugin
301+
Plugin=WebParser
302+
URL=[MeasureSite]
303+
StringIndex=4
304+
305+
[MeasureCity]
306+
Measure=Plugin
307+
Plugin=WebParser
297308
URL=[MeasureSite]
298309
StringIndex=5
299310

300311
[MeasureLatitude]
301-
Measure=WebParser
312+
Measure=Plugin
313+
Plugin=WebParser
302314
URL=[MeasureSite]
303315
StringIndex=6
304316

305317
[MeasureLongitude]
306-
Measure=WebParser
318+
Measure=Plugin
319+
Plugin=WebParser
307320
URL=[MeasureSite]
308321
StringIndex=7
309322
```
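<p>The way the seven captures line up with the seven child measures can be sketched in Python. Treat this as a rough illustration under stated assumptions: the sample page below is stitched together from the snippets shown in this tutorial, the State/Region, City and coordinate rows are made-up placeholders shaped to fit the RegExp, and <code>(?siU)</code> is emulated by turning every <code>.*</code> into the lazy <code>.*?</code>, which works for this particular pattern only.</p>

```python
import re

# The parent measure's RegExp, copied from the skin.
rainmeter_regexp = (
    r'(?siU)data-ip="(.*)".*<tr><td>Country</td>.*src="(.*)"'
    r'.*<span class="flag-text wball">(.*)<'
    r'.*<tr><td>State/Region</td><td>(.*)<.*<tr><td>City</td><td>(.*)<'
    r'.*<span id="coords-click">Coordinates</span>'
    r'.*data-lat="(.*)".*data-lon="(.*)"'
)

# Python has no U flag: drop it and make each quantifier lazy instead.
python_regexp = rainmeter_regexp.replace("(?siU)", "(?si)").replace(".*", ".*?")

# Hypothetical page: the last three rows use placeholder markup and values.
sample = '''
<tr><td>IP Address</td><td><span class="flag-container" id="client-ipv4" data-ip="68.100.86.32"></span></td></tr>
<tr><td>Country</td><td><span class="flag-container" id="lookup-flag" data-iso_code="US"><img class="flag-icon" src="/img/flags/US.png"
title="United States (US)"><span class="flag-text wball">United States </span></span></td></tr>
<tr><td>State/Region</td><td>Virginia</td></tr>
<tr><td>City</td><td>Fairfax</td></tr>
<tr><td><span id="coords-click">Coordinates</span></td><td><span data-lat="38.85" data-lon="-77.30"></span></td></tr>
'''

# Child measures in StringIndex order, 1 through 7.
names = ["MeasureIP", "MeasureFlagImage", "MeasureCountryName",
         "MeasureRegion", "MeasureCity", "MeasureLatitude", "MeasureLongitude"]

match = re.search(python_regexp, sample)
values = dict(zip(names, match.groups()))

for index, name in enumerate(names, start=1):
    # MeasureFlagImage shows the captured path here; in the skin, Download=1
    # turns it into the local path of the downloaded file.
    print(index, name, repr(values[name]))
```

<p>The order of the captures in the pattern is what fixes the StringIndex numbers, so adding a capture in the middle of the RegExp means renumbering every child measure after it.</p>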
@@ -371,16 +384,16 @@ <h2>The meters</h2>
371384
[MeterFlagImage]
372385
Meter=Image
373386
MeasureName=MeasureFlagImage
374-
X=90r
375-
Y=-1r
376-
H=30
387+
X=70r
388+
Y=4r
389+
H=20
377390
W=30
378391

379392
[MeterCountryName]
380393
Meter=String
381394
MeasureName=MeasureCountryName
382395
X=315
383-
Y=3r
396+
Y=-2r
384397
FontSize=11
385398
FontColor=252,251,202,255
386399
SolidColor=0,0,0,1
