How can I check what files are in the download section of a website using a C# Windows app?

Time: 2022-08-26 22:00:47

We produce a device that is used in the safety industry, and I have written the utility software to change user settings. There is an option to update the internal firmware, and I would like to know the best way to access the download section of our company website and check whether a new version of the firmware is available.


The file format will be similar to "Application Firmware VX_XX.hex"


I can download a specific file using


WebClient myClient = new WebClient();
myClient.DownloadFile(strDownloadAddress, FileName);

But I would rather not loop through many possible files just to see if one exists.


I have tried some suggestions, but they are mainly file system commands, which don't seem to work over HTTP.


I have no experience programming web apps or services, and Global IT have denied me access to the website to add services etc.


1 Solution



I'm just going to throw this out there as a pretty crude solution to your immediate issue, which I understand to be a need to retrieve a list of available update files from a web page. Hopefully it helps a bit.


As I don't have access to your company's updates page, I'm going to use a page on Project Gutenberg which lists Children's Anthologies and grab all the links from that instead.


The first thing to do is to grab the source code for the page into a string I'm calling source.


WebClient client = new WebClient();
String source = client.DownloadString(@"");

Then everything after that is just the same as programming for the desktop. source is just a piece of text to be parsed. In case you are unfamiliar, links in HTML are written <a href="linkgoeshere">textgoeshere</a> and because we're looking for that specific pattern we can just rip all the links out with Regex.


Regex regex = new Regex("href=\"(http://[^\"]+)\"", RegexOptions.IgnoreCase | RegexOptions.Multiline );
var matches = regex.Matches(source);
foreach (Match match in matches)
{
    // Group 1 is the URL captured between the quotes.
    Console.WriteLine(match.Groups[1].Value);
}

That code will output every link on the page that starts with http://. With a bit of luck, the files your update page links to will be direct links ending in .hex, which makes your Regex easier to target. You can then sort them as required and use the latest link to grab the file.
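As a rough sketch of the "sort them and use the latest link" step: the helper below is hypothetical (the names FirmwareLinks and FindLatest are mine) and assumes each firmware link ends in "V<major>_<minor>.hex", following the "Application Firmware VX_XX.hex" format from the question. If your real links carry query strings or a different version scheme, the pattern will need adjusting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FirmwareLinks
{
    // Assumed naming scheme: the link ends in "V<major>_<minor>.hex", e.g. "V1_07.hex".
    private static readonly Regex VersionPattern =
        new Regex(@"V(\d+)_(\d+)\.hex$", RegexOptions.IgnoreCase);

    // Returns the link with the highest version, or null if no link matches.
    public static string FindLatest(IEnumerable<string> links)
    {
        return links
            .Select(link => new { Link = link, Match = VersionPattern.Match(link) })
            .Where(x => x.Match.Success)
            // Compare the numbers, not the text, so that V1_10 ranks above V1_9.
            .OrderByDescending(x => int.Parse(x.Match.Groups[1].Value))
            .ThenByDescending(x => int.Parse(x.Match.Groups[2].Value))
            .Select(x => x.Link)
            .FirstOrDefault();
    }
}
```

You would pass it the URLs collected in the loop above and compare the result against the version already on the device before calling DownloadFile.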


EDIT: Incidentally, some links are written as <a href='linkgoeshere'>textgoeshere</a>, with a ' rather than a ". Sometimes there is also more info crammed in between the angle brackets, but your URL should always come directly after the href=["'] part.
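If you want to cope with both quote styles, something along these lines might do. It is only a sketch: LinkExtractor and ExtractLinks are names I made up, and pageUri is assumed to be the address you downloaded the page from. It also resolves relative links against that address, since a download page may well not use full http:// URLs.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Matches href="..." or href='...'; the backreference \1 makes the
    // closing quote match whichever quote character opened the value.
    private static readonly Regex HrefPattern =
        new Regex(@"href\s*=\s*([""'])(.*?)\1", RegexOptions.IgnoreCase);

    public static List<string> ExtractLinks(string html, Uri pageUri)
    {
        var links = new List<string>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            // Group 2 is the raw href value; resolve it against the page
            // address so relative links such as "files/x.hex" become absolute.
            Uri absolute;
            if (Uri.TryCreate(pageUri, match.Groups[2].Value, out absolute))
            {
                links.Add(absolute.AbsoluteUri);
            }
        }
        return links;
    }
}
```

This is still crude text matching rather than real HTML parsing, so unquoted href values and HTML entities inside URLs are not handled.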
