microsoft.public.vb.general.discussion

Curious issue with Inet and a website

mm

5/30/2012 3:06:00 AM

Hello,

I made a program using the Internet Transfer Control that sends automatic
queries to a webpage. I compiled it because it works faster.
It worked, but after a number of queries the web server started to
respond "service unavailable, the user has reached the maximum number of
queries".

OK, but the strange thing is that in the IDE the server still responds.

I tried many things to find out how the server identifies the "user":

I changed the IP of my internet connection.
I changed the exe name, product description, comments, etc. Also even the
exe path.
I found some code to change the User Agent (but I don't know if it actually
worked).
I ran CCleaner.
I even rebooted the computer.

What difference in the connection to the server could cause it to identify
the compiled program as the same user, but not the program running in the
IDE?

It's not a problem for me because it's just a one-time job, and I can still
finish it by running the program in the IDE (there are few queries left),
but I'm curious about what is happening.



44 Answers

Jason Keats

5/30/2012 3:33:00 AM

On 30/05/2012 13:06, Eduardo wrote:
>
> I made a program using the Internet Transfer Control that sends automatic
> queries to a webpage. I compiled it because it works faster.
> Ok, it worked, but after a number of queries the web server started to
> respond "service unavailable, the user has reach the maximum number of
> queries".
>
> OK, but the strange thing is that in the IDE the server still responds.
>

Apparently you can't change the user-agent of the ITC...

http://stackoverflow.com/questions/8621264/how-to-change-the-useragent-of-in...

You could double-check that using Wireshark.

So, maybe it's just that your exe is connecting more quickly - so
quickly that it triggers protective behaviour on the server.

Or, could it be that it's punishment for not observing the rules in
their robots.txt file?
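
To make the robots.txt point concrete, here is a minimal sketch in Python (not VB) of checking a URL against a site's rules before querying it. The robots.txt content, the paths and the example.com host are all made up for illustration; a real check would load the file from the site in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

def check_access(robots_txt, user_agent, url):
    """Return (allowed, crawl_delay) for this user agent and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url), parser.crawl_delay(user_agent)

if __name__ == "__main__":
    agent = "Microsoft URL Control - 6.01.9782"
    for url in ("http://example.com/convert", "http://example.com/private/x"):
        print(url, check_access(ROBOTS_TXT, agent, url))
```

If the site publishes a Crawl-delay, a client that queries faster than that is the kind of thing a server may choose to throttle.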

mm

5/30/2012 5:25:00 AM

"Jason Keats" <jkeats@melbpcDeleteThis.org.au> wrote in message
news:jq44cj$vtt$1@speranza.aioe.org...
> On 30/05/2012 13:06, Eduardo wrote:
>>
>> I made a program using the Internet Transfer Control that sends automatic
>> queries to a webpage. I compiled it because it works faster.
>> Ok, it worked, but after a number of queries the web server started to
>> respond "service unavailable, the user has reach the maximum number of
>> queries".
>>
>> OK, but the strange thing is that in the IDE the server still responds.
>>
>
> Apparently you can't change the user-agent of the ITC...
>
> http://stackoverflow.com/questions/8621264/how-to-change-the-useragent-of-in...

I read about that on a forum, but I also found code to change it on another
forum.

But if the "same user" is identified by the user agent, then the Inet
control would have to be sending a different user agent when it runs in the
IDE. I find that hard to believe, unless the server applies the rule only
to non-standard user agents.

> You could double-check that using Wireshark.

I haven't checked what is actually being sent. I could run a test program
that sends requests to a website of mine and then look at the logs.

> So, maybe it's just that your exe is connecting more quickly - so quickly
> that it triggers protective behaviour on the server.
>
> Or, could it be that it's punishment for not observing the rules in their
> robots.txt file?

No. It worked fine for about an hour, then the server began to send
access-denied responses (due to too many queries by the _same user_).


Jason Keats

5/30/2012 7:29:00 AM

On 30/05/2012 15:25, Eduardo wrote:
>
> No. It worked fine for about an hour, then the server begun to send access
> denied (due to too many queries by the _same user_) responses.


As you know, you can't hide your router's IP address unless you use
something like Tor (the onion router) - so they're going to know your
router's IP.

So, your IP address and user agent, combined, are pretty good identifiers.

If I owned a website with a database full of intellectual property that
I didn't want stolen by a VB programmer using the ITC or MSXML, then I'd
probably limit access in some way - by throttling/banning a particular
IP, user-agent, or whatever.

I have no reason to believe that the internet transfer control sends
different user agent information in the IDE and exe. But who knows
without testing?
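
One way to test it without Wireshark is a small local server that echoes back whatever headers the client sent. A minimal sketch in Python, assuming you can point both the IDE build and the exe at a local URL; the class and function names here are made up for illustration:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHeaders(BaseHTTPRequestHandler):
    """Reply to any GET with the request headers as plain text."""

    def do_GET(self):
        body = "".join(f"{k}: {v}\n" for k, v in self.headers.items()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet

def start_echo_server():
    # Port 0 lets the OS pick a free port; read it from server.server_port.
    server = HTTPServer(("127.0.0.1", 0), EchoHeaders)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def fetch_echo(server, user_agent):
    # Stand-in for the real client; bypasses any configured proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    req = urllib.request.Request(
        f"http://127.0.0.1:{server.server_port}/",
        headers={"User-Agent": user_agent},
    )
    with opener.open(req, timeout=5) as resp:
        return resp.read().decode()

if __name__ == "__main__":
    srv = start_echo_server()
    print(fetch_echo(srv, "Microsoft URL Control - 6.01.9782"))
    srv.shutdown()
```

Requesting the same URL from the IDE and from the exe and comparing the two responses would show any header that differs, not only the user agent.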

Good luck.

mm

5/30/2012 7:47:00 AM

"Jason Keats" <jkeats@melbpcDeleteThis.org.au> wrote in message
news:jq4i7k$ujs$1@speranza.aioe.org...
> On 30/05/2012 15:25, Eduardo wrote:
>>
>> No. It worked fine for about an hour, then the server begun to send
>> access
>> denied (due to too many queries by the _same user_) responses.
>
>
> As you know, you can't hide your router's IP address unless you use
> something like Tor (the onion router) - so they're going to know your
> router's IP.

> So, your IP address and user agent, combined, are pretty good identifiers.

I have ADSL with a dynamic IP, so I can change my IP as often as I want (I
just need to disconnect and reconnect), and I already did that, as I said
before.

> If I owned a website with a database full of intellectual property that I
> didn't want stolen by a VB programmer using the ITC or MSXML, then I'd
> probably limit access in some way - by throttling/banning a particular IP,
> user-agent, or whatever.

In this case it's a free tool for converting characters.

> I have no reason to believe that the internet transfer control sends
> different user agent information in the IDE and exe. But who knows without
> testing?

I don't know if it is the user agent, but it is sending something different,
because the server sees the IDE and the exe as different users.

And once again, changing the IP does not change the situation.


Schmidt

5/30/2012 9:45:00 AM

On 30.05.2012 09:47, Eduardo wrote:

> I don't know if it is the user agent, but something
> different it is sending, because the server sees the
> IDE and the exe as a different users.

If your test machine runs Vista or Win7, then you're probably
running your IDE under the admin account...

And Windows has user-specific caches for cookies and
the like...

Perhaps the simplest test is to run your compiled exe
"as Admin" as well. If the behaviour is then similar
to the VB IDE, the only remaining problem is to
determine the appropriate (user-dependent) "cookie paths".

Olaf


mm

5/30/2012 11:16:00 AM

"Schmidt" <sss@online.de> wrote in message
news:jq4q7r$5ao$1@dont-email.me...
> On 30.05.2012 09:47, Eduardo wrote:
>
>> I don't know if it is the user agent, but something
>> different it is sending, because the server sees the
>> IDE and the exe as a different users.
>
> In case your test-machine is Vista or Win7, then you're
> running your IDE under the Admin-Account probably...
>
> And Windows has User-specific caches for cookies and
> the like...
>
> Simplest test is perhaps, to run also your compiled
> Exe "as Admin" - and if the behaviour is then similar
> to the VB-IDE, then the only remaining problem is,
> to determine the appropriate (userdependent) "cookie-paths".
>
> Olaf

Hello Schmidt,

I'm running it on XP.

I had run CCleaner to erase the cookies, and the "problem" still happened.


Mayayana

5/30/2012 12:48:00 PM

Here's a site to test IP and userAgent:

http://whatsmyuser...

The page loads showing your userAgent, so if you're
not displaying the page you could still parse the content
to get your userAgent. Or you can load a local page
with this code in it:

<SCRIPT LANGUAGE="VBScript">
document.write window.navigator.userAgent
</SCRIPT>

I was thinking the same as Olaf, that it might
be a cookies issue. I would make sure to remove
them through IE with the relevant user logged in,
rather than with a system cleaner.

Aside from that there could be "super cookies",
including Flash cookies and global data storage
cookies. You need to disable script and Flash if
you really want to avoid being tagged by a website.
Online companies put a lot of effort into perfecting
ways to leave their brand on all visitors.

Flash cookies: Delete any folder named "Macromedia"
in AppData, All Users AppData, or Local AppData.
IE super cookies: Delete folder if it exists:
[AppData]\Microsoft\Internet Explorer\UserData
(There's also a setting for that, I think, but I have IE6
and that setting came later, so I don't know what the
name of it is.)

Another thought: Maybe there's code in the webpage
that might tell you something?

Or maybe the compiled version is breaking through some
kind of time limit measure server-side? (You said it works
faster compiled.)

I wondered, too, why you're sending multiple GETs
in series. That's usually a sign of badly written site
"scrapers"/download helpers, or of inconsiderate bot
operators. I don't have methods to deal with such
things automatically on my site, but I do put a number
of unique userAgent string snippets into my .htaccess
file to be blocked with a 403 message, because I've
recognized those userAgents to be associated with bad
behavior. I can't think of any valid reason to hit a site
with repeated GETs.

--
"Eduardo" <mm@mm.com> wrote in message
news:jq4vi0$3s6$1@speranza.aioe.org...
| "Schmidt" <sss@online.de> wrote in message
| news:jq4q7r$5ao$1@dont-email.me...
| > On 30.05.2012 09:47, Eduardo wrote:
| >
| >> I don't know if it is the user agent, but something
| >> different it is sending, because the server sees the
| >> IDE and the exe as a different users.
| >
| > In case your test-machine is Vista or Win7, then you're
| > running your IDE under the Admin-Account probably...
| >
| > And Windows has User-specific caches for cookies and
| > the like...
| >
| > Simplest test is perhaps, to run also your compiled
| > Exe "as Admin" - and if the behaviour is then similar
| > to the VB-IDE, then the only remaining problem is,
| > to determine the appropriate (userdependent) "cookie-paths".
| >
| > Olaf
|
| Hello Schmidt,
|
| I'm running it on XP.
|
| I had run CCleaner to erase the cookies, and the "problem" still happened.
|
|


mm

5/30/2012 4:02:00 PM

"Mayayana" <mayayana@invalid.nospam> wrote in message
news:jq54rp$vh2$1@dont-email.me...
> Here's a site to test IP and userAgent:
>
> http://whatsmyuser...
>
> The page loads showing your userAgent, so if you're
> not displaying the page you could still parse the content
> to get your userAgent. Or you can load a local page
> with this code in it:

I just tested it:

IDE: Microsoft URL Control - 6.01.9782
EXE: Microsoft URL Control - 6.01.9782

The same.

> <SCRIPT LANGUAGE="VBScript">
> document.write window.navigator.userAgent
> </SCRIPT>
>
> I was thinking the same as Olaf, that it might
> be a cookies issue. I would make sure to remove
> them through IE with the relevant user logged in,
> rather than, with a system cleaner.

But we are talking about the "Internet Transfer Control", not the
WebBrowser control.
Anyway, when I didn't know what else to try, I ran CCleaner.
I'm on XP, so rights are not much of an issue (and I'm running as admin,
the default).

> Aside from that there could be "super cookies",
> including Flash cookies and global data storage
> cookies. You need to disable script and Flash if
> you really want to avoid being tagged by a website.
> Online companies put a lot of effort into perfecting
> ways to leave their brand on all visitors.

Yes, I know, but again, it was the Internet Transfer Control. And the page
was in fact a query service that returned one line of about ten plain-text
words.

If you want to test it I can provide the URL of the page.
But it took about 100,000 queries to get the error message. So it can take a
while to test.

>
> Flash cookies: Delete any folder named "Macromedia"
> in AppData, All Users AppData, or Local AppData.
> IE super cookies: Delete folder if it exists:
> [AppData]\Microsoft\Internet Explorer\UserData
> (There's also a setting for that, I think, but I have IE6
> and that setting came later, so I don't know what the
> name of it is.)
>
> Another thought: Maybe there's code in the webpage
> that might tell you something?

I didn't save it. But it was a simple page saying something like "access
denied, the user has reached the limit of queries", and maybe there was an
error number. By the way, after some hours it's working again (no error now,
it allows "the user" to query again).

How did I read the page? The exe started stopping with an error.
Then I ran it in the IDE, and it worked...
Tried the exe... error.
Went to the IDE again, worked.
Compiled again.
Ran the exe, error.
I put in some message boxes to see where it stopped. Compiled and ran the exe.
Found the place: it was in the routine that interprets the HTML result.
There I put a Clipboard.Clear and a Clipboard.SetText TheHTML.

I pasted that to Notepad and read the message.

>
> Or maybe the compiled version is breaking through some
> kind of time limit measure server-side? (You said it works
> faster compiled.)

But it worked fine for about an hour...
But maybe, because I was using 10 Inet controls in an array to query in
parallel.

And I don't know if the error came on the first query.
Maybe the server reduced the query rate it allowed, and in the IDE it still
worked because it ran slower (it was about 10 times slower in the IDE).
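
To illustrate that guess, here is a minimal Python sketch of a sliding-window rate limiter, fed by a fast client and by one 10 times slower. The limit (5 requests per second) and the intervals are invented numbers; nothing here is known about what the real server does.

```python
from collections import deque

class SlidingWindowLimiter:
    """Allow at most max_requests within any window_ms milliseconds."""

    def __init__(self, max_requests, window_ms):
        self.max_requests = max_requests
        self.window_ms = window_ms
        self.accepted = deque()  # timestamps of accepted requests

    def allow(self, now_ms):
        # Forget accepted requests that have left the window.
        while self.accepted and now_ms - self.accepted[0] >= self.window_ms:
            self.accepted.popleft()
        if len(self.accepted) >= self.max_requests:
            return False
        self.accepted.append(now_ms)
        return True

def count_rejected(interval_ms, total_requests, limiter):
    """Send evenly spaced requests and count how many are refused."""
    rejected = 0
    for i in range(total_requests):
        if not limiter.allow(i * interval_ms):
            rejected += 1
    return rejected

if __name__ == "__main__":
    # Same "user", same limit; only the request interval differs.
    print("fast:", count_rejected(100, 100, SlidingWindowLimiter(5, 1000)))
    print("slow:", count_rejected(1000, 100, SlidingWindowLimiter(5, 1000)))
```

Under a rule like this the server would not need to tell the IDE and the exe apart at all: the slower client simply never exceeds the allowed rate.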

> I wondered, too, why you're sending multiple GETS
> in series. That's usually a sign of badly written site
> "scrapers"/download helpers, or of inconsiderate bot
> operators.

And what do you do if you need to get about 114,000 results to store in a
database?

> I don't have methods to deal with such
> things automatically on my site, but I do put a number
> of unique userAgent string snippets into my .htaccess
> file to be blocked with a 403 message, because I've
> recognized those userAgents to be associated with bad
> behavior. I can't think of any valid reason to hit a site
> with repeated GETs.

If you want I can tell you what I'm doing.


Dee Earley

5/30/2012 4:26:00 PM

On 30/05/2012 17:02, Eduardo wrote:
> And how do you do if you need to get about 114,000 results to be stored in a
> database?

You ask them very nicely.

As you've found out, sites tend to have limits in place, especially for
that large an amount of data.
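
If the site owner does allow bulk access, a client can at least stay polite. A minimal Python sketch of pacing requests and backing off on "503 service unavailable"; the fetch callable, the one-second interval and the retry count are assumptions for illustration, not anything this particular site documents.

```python
import time

def fetch_all(items, fetch, min_interval=1.0, max_retries=3, sleep=time.sleep):
    """Fetch items one at a time, pausing between requests.

    fetch(item) must return (status_code, body). On a 503 the wait is
    doubled before each retry; any other failure stops the run.
    """
    results = {}
    for item in items:
        delay = min_interval
        for attempt in range(max_retries + 1):
            status, body = fetch(item)
            if status == 200:
                results[item] = body
                break
            if status != 503 or attempt == max_retries:
                raise RuntimeError(f"giving up on {item!r}: HTTP {status}")
            delay *= 2          # back off: wait longer before each retry
            sleep(delay)
        sleep(min_interval)     # pause before moving to the next item
    return results
```

Passing sleep as a parameter is only there so the pacing can be tested without actually waiting.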

--
Deanna Earley (dee.earley@icode.co.uk)
i-Catcher Development Team
http://www.icode.co.uk...

iCode Systems

(Replies direct to my email address will be ignored.
Please reply to the group.)

mm

5/30/2012 4:47:00 PM

"Deanna Earley" <dee.earley@icode.co.uk> wrote in message
news:jq5hmq$oh8$1@speranza.aioe.org...
> On 30/05/2012 17:02, Eduardo wrote:
>> And how do you do if you need to get about 114,000 results to be stored
>> in a
>> database?
>
> You ask them very nicely.

How do you ask Google (for example) to translate 114,000 Spanish
words into English for you?

> As you've found out, sites tend to have limits in place. especially for
> that large an amount of data.

I have known that for a long time. But not every server has limits anyway.