Downloading A Web Page Using Python
In this tutorial I am going to give you a gentle introduction to network programming in Python. If you are new to programming or new to Python then that may seem like a daunting thought. But read on and you will be pleasantly surprised how easy it is.
Like most modern programming languages, Python was designed for networking from the very beginning, and thanks to that, a lot of the networking tasks you would want to accomplish with the language are made a whole lot easier.
Network communication is a large topic, but if it is something that interests you then read on because in this tutorial I will show you how to download a web page. I will show you how easy Python makes tasks like this.
Take a look at the following code:
import urllib con = urllib.urlopen("http://hartmannsoftware.com") page = con.read() con.close() print page
Yes, you’re eyes are just fine. That really is only five lines of code to perform such a powerful operation. So what does it all mean?
The first line of code imports the urllib module for us to use. This module contains various networking functions we can use to perform network based operations such as connecting to a server and receiving data.
On the second line we call the urlopen function of the urllib module and give it the address of the page we want to download. In this case I’ve used Slashdot but you can easily replace that with any other address. We assign the result of the urlopen function to our variable named con, which is a connection object.
Next up on the third line of code we create a variable called page and assign it the results of our connection objects read function. In this case the result will be all the html and text and found on the web page.
The fourth line is simple enough to understand. All it does is close the connection so we can’t send and receive any more data.
And lastly we use the print statement to output all the data we received, which will basically be every piece of text, html, javascript and css which makes up the web page.
So there, you have it. In just five lines of code you were able to connect to a server and then download a web page from it. Trying to do the same thing in other languages can be a rather long winded experience often requiring you to have a good knowledge of how sockets work.
Now, I will be the first to admit that downloading a web page isn’t exactly the most exciting thing you can do in Python, but it gives you a taste of the kind of power which is built into the language and the sheer simplicity of it is amazing.
So I hope you enjoyed this article and if you are interested in learning more then pay us a visit again for more tutorials to help you learn.
other blog entries
Course Directory [training on all levels]
- .NET Classes
- Agile/Scrum Classes
- Ajax Classes
- Android and iPhone Programming Classes
- Blaze Advisor Classes
- C Programming Classes
- C# Programming Classes
- C++ Programming Classes
- Cisco Classes
- Cloud Classes
- CompTIA Classes
- Crystal Reports Classes
- Design Patterns Classes
- DevOps Classes
- Foundations of Web Design & Web Authoring Classes
- Git, Jira, Wicket, Gradle, Tableau Classes
- IBM Classes
- Java Programming Classes
- JBoss Administration Classes
- JUnit, TDD, CPTC, Web Penetration Classes
- Linux Unix Classes
- Machine Learning Classes
- Microsoft Classes
- Microsoft Development Classes
- Microsoft SQL Server Classes
- Microsoft Team Foundation Server Classes
- Microsoft Windows Server Classes
- Oracle, MySQL, Cassandra, Hadoop Database Classes
- Perl Programming Classes
- Python Programming Classes
- Ruby Programming Classes
- Security Classes
- SharePoint Classes
- SOA Classes
- Tcl, Awk, Bash, Shell Classes
- UML Classes
- VMWare Classes
- Web Development Classes
- Web Services Classes
- Weblogic Administration Classes
- XML Classes
- Introduction to Spring 5 (2022)
16 December, 2024 - 18 December, 2024 - Introduction to C++ for Absolute Beginners
16 December, 2024 - 17 December, 2024 - VMware vSphere 8.0 with ESXi and vCenter
9 December, 2024 - 13 December, 2024 - Fast Track to Java 17 and OO Development
9 December, 2024 - 13 December, 2024 - VMware vSphere 8.0 Boot Camp
9 December, 2024 - 13 December, 2024 - See our complete public course listing
did you know? HSG is one of the foremost training companies in the United States
Our courses focus on two areas: the most current and critical object-oriented and component based tools, technologies and languages; and the fundamentals of effective development methodology. Our programs are designed to deliver technology essentials while improving development staff productivity.
An experienced trainer and faculty member will identify the client's individual training requirements, then adapt and tailor the course appropriately. Our custom training solutions reduce time, risk and cost while keeping development teams motivated. The Hartmann Software Group's faculty consists of veteran software engineers, some of whom currently teach at several Colorado Universities. Our faculty's wealth of knowledge combined with their continued real world consulting experience enables us to produce more effective training programs to ensure our clients receive the highest quality and most relevant instruction available. Instruction is available at client locations or at various training facilities located in the metropolitan Denver area.
Upcoming Classes
- Introduction to Spring 5 (2022)
16 December, 2024 - 18 December, 2024 - Introduction to C++ for Absolute Beginners
16 December, 2024 - 17 December, 2024 - VMware vSphere 8.0 with ESXi and vCenter
9 December, 2024 - 13 December, 2024 - Fast Track to Java 17 and OO Development
9 December, 2024 - 13 December, 2024 - VMware vSphere 8.0 Boot Camp
9 December, 2024 - 13 December, 2024 - See our complete public course listing
consulting services we do what we know ... write software
The coaching program integrates our course instruction with hands on software development practices. By employing XP (Extreme Programming) techniques, we teach students as follows:
Configure and integrate the needed development tools
MOntitor each students progress and offer feedback, perspective and alternatives when needed.
Establish an Action plan to yield a set of deliverables in order to guarantee productive learning.
Establish an Commit to a deliverable time line.
Hold each student accountable to a standard that is comparable to that of an engineer/project manager with at least one year's experience in the field.
These coaching cycles typically last 2-4 weeks in duration.
Business Rule isolation and integration for large scale systems using Blaze Advisor
Develop Java, .NET, Perl, Python, TCL and C++ related technologies for Web, Telephony, Transactional i.e. financial and a variety of other considerations.
Windows and Unix/Linux System Administration.
Application Server Administration, in particular, Weblogic, Oracle and JBoss.
Desperate application communication by way of Web Services (SOAP & Restful), RMI, EJBs, Sockets, HTTP, FTP and a number of other protocols.
Graphics Rich application development work i.e. fat clients and/or Web Clients to include graphic design
Performance improvement through code rewrites, code interpreter enhancements, inline and native code compilations and system alterations.
Mentoring of IT and Business Teams for quick and guaranteed expertise transfer.
Architect both small and large software development systems to include: Data Dictionaries, UML Diagrams, Software & Systems Selections and more