GitHub.com/Moe82/SEO-Tools

SEO Tools is a collaborative project that provides tools for web developers to check meta tags, broken links, and more. The goal of this project is to provide the tools and resources needed to keep your website up to date with search engine optimization.

6 SEO Tasks to Automate with Python

SEO-Tools
A collection of SEO automation tools that I wrote in Python.

Webmaster
The second script in my Python SEO automation pipeline. It transforms a list of niche-related keywords into a list of email addresses (along with the first name, last name, and domain authority) of webmasters within that niche.

The script is fairly simple and merely ties three different APIs together, but the output is quite significant. With enough keywords (and time), it can generate thousands of high-quality email addresses for webmasters within a specific niche. Collecting this information manually would be nearly impossible.

The output (a .csv file) can be fed directly into your favorite email marketing software and the keywords can simply be those that your site ranks for.
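The first stage of such a pipeline is collapsing keyword search results into a deduplicated list of domains to look up. A minimal sketch of that step, assuming result URLs have already been fetched (the `unique_domains` helper is my own illustration, not code from the repository):

```python
from urllib.parse import urlparse

def unique_domains(urls):
    """Collapse a list of search-result URLs to unique domains,
    preserving first-seen order and dropping a leading 'www.'."""
    seen, out = set(), []
    for url in urls:
        host = urlparse(url).netloc.lower()
        host = host[4:] if host.startswith("www.") else host
        if host and host not in seen:
            seen.add(host)
            out.append(host)
    return out

# Result URLs for one keyword collapse to two candidate domains
urls = [
    "https://www.example.com/post/seo-tips",
    "https://example.com/about",
    "http://blog.another-site.org/page",
]
print(unique_domains(urls))  # ['example.com', 'blog.another-site.org']
```

Each resulting domain would then be passed to the Moz and Hunter.io APIs for authority scores and contacts.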

Requirements
Python 3
Google library (note: the package has no association with Google, LLC)
A free Moz API key
A free Hunter.io API key
Instructions
Clone this repository and install the Google library.
git clone https://github.com/Moe82/SEO-Tools.git
pip install google
Open keywords.txt (located inside the data directory) and populate it with keywords related to your niche, one keyword per line. Tip: you can download a very large file of keywords that you rank for on search.google.com
Run the script.
python3 webmaster.py
When prompted, enter your Moz API access ID and secret key, and your Hunter.io API key. Alternatively, you can have the script load your credentials automatically. To do so, paste the text below into a .csv file, save it as credentials.csv, and place it inside the data directory. If a credentials.csv file is already present inside the data directory, replace it with the one you created.
MOZ_ACCESS_ID,
MOZ_SECRET_KEY,
HUNTER_API_KEY,
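Each row of that file is a NAME,value pair, so loading it is a few lines of stdlib code. A sketch of how such a file might be parsed (the `load_credentials` helper is my own illustration, not the actual code in webmaster.py):

```python
import csv
import io

def load_credentials(text):
    """Parse the credentials.csv format shown above:
    one 'NAME,value' row per credential."""
    creds = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2 and row[0].strip():
            creds[row[0].strip()] = row[1].strip()
    return creds

sample = "MOZ_ACCESS_ID,my-id\nMOZ_SECRET_KEY,my-secret\nHUNTER_API_KEY,my-key\n"
print(load_credentials(sample)["HUNTER_API_KEY"])  # my-key
```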
As the script runs, the three CSV files below (plus a history file) are updated as information is collected. Each row represents a contact, and each contact includes a first name, last name, email address and its confidence score (as determined by the Hunter.io API), organization, domain, and domain authority. Note that if information for any of these fields is not found, the corresponding cell is left empty.

personal_contact.csv – Email addresses that are flagged as “personal” and have an associated first name and last name. This is what you want to use for your outreach.
personal_contact_extra.csv – Email addresses that are flagged as “personal” but don’t have an associated first name and last name. You can use this list as well, but I find that opening an email with “Dear first_name last_name,” greatly increases the response rate.
nonpersonal_contact.csv – All other email addresses (those not flagged as personal).
history.txt – Contains the domains of sites that have already been scraped. This file essentially lets you terminate the script at any point and resume where you left off the next time you run it.
By default, sites with a domain authority score greater than 70 are ignored, as such sites generally aren’t good targets for backlink acquisition. This threshold can be changed on line 11 of webmaster.py.
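The filter itself amounts to a single comparison per contact. A hedged sketch of the idea, with names of my own choosing rather than webmaster.py's:

```python
MAX_DA = 70  # default threshold mentioned above; adjustable in webmaster.py

def filter_targets(rows, limit=MAX_DA):
    """Keep contacts whose site's domain authority is at or below the
    limit; rows are (domain, domain_authority) pairs."""
    return [row for row in rows if row[1] <= limit]

rows = [("small-blog.com", 32), ("bignews.com", 91), ("mid-site.org", 55)]
print(filter_targets(rows))
# [('small-blog.com', 32), ('mid-site.org', 55)]
```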

Common HTTP response errors
HTTP Error 429 – Too many calls to the API. Either upgrade your free trial account or create a new one.
HTTP Error 401 – Incorrect API credentials. Check that you entered them correctly and try again.
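A generic way to react to these two statuses in code looks roughly like this; it is a sketch of the usual pattern (retry 429 with backoff, abort on 401), not webmaster.py's actual error handling:

```python
import time

def handle_api_status(status, attempt, max_attempts=3):
    """Decide what to do with an HTTP status from one of the APIs.
    Returns 'ok', 'retry' (after a backoff sleep), or 'abort'."""
    if status == 200:
        return "ok"
    if status == 429 and attempt < max_attempts:
        time.sleep(2 ** attempt)  # simple exponential backoff
        return "retry"
    # 401 (bad credentials) and anything else: retrying will not help
    return "abort"

print(handle_api_status(429, attempt=0))  # retry (after a short pause)
```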

python seo analyzer

Python is all about automating repetitive tasks, leaving more time for the rest of your Search Engine Optimization (SEO) efforts. Not many SEOs use Python for their problem-solving, even though it could save them a lot of time and effort. Python can, for example, be used for the following tasks:

  • Data extraction
  • Preparation
  • Analysis & visualization
  • Machine learning
  • Deep learning

We’ll be focusing mostly on data extraction and analysis in this article. The required modules are indicated for each script.

Python SEO analyzer

A really useful script for analyzing your website is called ‘SEO analyzer’. It’s an all-round website crawler that reports the following information:

  • Word count
  • Page Title
  • Meta Description
  • Keywords on-page
  • Warnings
  • Missing title
  • Missing description
  • Missing image alt-text

This is great for a quick analysis of your basic SEO problems. As page titles, meta descriptions and on-page keywords are important ranking factors, this script is perfect for gaining a clear picture of any problems that might be in play.
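To make those warnings concrete, here is a stdlib-only sketch of the title, meta description, and image alt-text checks, using `html.parser` instead of the analyzer's own BeautifulSoup-based crawler (class and attribute names are my own):

```python
from html.parser import HTMLParser

class BasicSEOCheck(HTMLParser):
    """Collect the facts the analyzer warns about: page title,
    meta description, and images missing alt text."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content", "")
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

page = ("<html><head><title>Demo</title>"
        "<meta name='description' content='A demo page'>"
        "</head><body><img src='a.png'><img src='b.png' alt='B'></body></html>")
check = BasicSEOCheck()
check.feed(page)
print(check.title, check.description, check.images_missing_alt)
# Demo A demo page 1
```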

Using the SEO analyzer

After installing the necessary modules (BeautifulSoup 4 and urllib) and updating Python to version 3.4+, you are technically ready to use this script. The analyzer’s JSON output is useful for exporting the data you gain from it. After installing the script, these are the commands you can use:

seoanalyze http://internetvergelijnk.nl/

seoanalyze https://telefoonvergelijk.nl --sitemap https://telefoonvergelijk.nl/sitemap_index.xml

As seen in the examples above, for both internetvergelijk and telefoonvergelijk, it’s possible to crawl either the website itself or its XML sitemap in order to do an SEO analysis. Another option is to generate HTML output from the analysis instead of JSON. This can be done through the following command:

seoanalyze http://internetvergelijk.nl/ --output-format-html

If you want to work with the data from Python and export it as JSON, import the analyzer directly:

from seoanalyzer import analyse
output = analyse(site, sitemap)
print(output)

You can also run the analysis as a script instead; with --output-format html it exports the results to an HTML file. This seoanalyze script is great for optimizing your page titles, meta descriptions, images and on-page keywords. It’s also a lot faster than Screaming Frog, so if you’re only looking for this information, running the seoanalyze script is more efficient.

Link status analyser

Another way to use Python for Search Engine Optimization is a script that crawls your website and analyses your URL status codes. This script is called pylinkvalidator. All it requires is BeautifulSoup if you’re running it with Python 3.x; if you’re running a 2.x version like 2.6 or 2.7, you should not need BeautifulSoup.

In order to speed up the crawling, however, it might be useful to install the following libraries:

  • lxml – speeds up the crawling of HTML pages (requires C libraries)
  • gevent – enables pylinkvalidator to use green threads
  • cchardet – speeds up document encoding detection

Do keep these in mind: they can be very useful for crawling larger websites and generally enhance the link status analyser.

What this script essentially does is crawl the entire URL structure of a website in order to analyse the status code of each and every URL. This makes it a very long process for bigger websites, hence the recommendation to use the optional libraries to speed it up.
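The reporting side of such a crawl reduces to bucketing status codes. A minimal sketch of that idea (my own helper, not pylinkvalidator's internals):

```python
def classify_status(code):
    """Bucket an HTTP status code the way a link checker reports it."""
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"      # not shown by the default crawl
    if 400 <= code < 500:
        return "broken"        # e.g. 404
    return "server error"      # 5xx

print([classify_status(c) for c in (200, 301, 404, 500)])
# ['ok', 'redirect', 'broken', 'server error']
```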

Using the link status analyser

Pylinkvalidator has a ton of different usage options. For example, you can:

  • Show progress
  • Crawl the website and pages belonging to another host
  • Only crawl a single page and the pages it links to
  • Only crawl links, ignore others (images, stylesheets, etc.)
  • Crawl a website with more threads or processes than default
  • Change your user agent
  • Crawl multiple websites
  • Check robots.txt
  • Crawl body tags and paragraph tags

Showing progress through -P or --progress is recommended; without it, you will find yourself wondering when your crawl will be done, with no visual signs. The options for crawling with more threads (--workers=<number of workers>) and processes (--mode=process --workers=<number of workers>) can be very useful as well.

Of course, the script has many more options to explore. The examples below show some of the possible uses:

pylinkvalidate.py -P http://www.example.com/

The command above crawls the website and shows progress.

pylinkvalidate.py -P --workers=4 http://www.example.com/

This command crawls a website with multiple threads and shows progress.

pylinkvalidate.py -P --parser=lxml http://www.example.com/

This command uses the lxml library to speed up the crawl while showing progress.

pylinkvalidate.py -P --types=a http://www.example.com/

The command above only crawls links (<a href>) on your website, ignoring images, scripts, stylesheets and any other non-link elements. This is also useful when crawling the URLs of large websites. After the script has run its course, you’ll get a list of URLs with status codes 4xx and 5xx that were found by crawling your website. Along with that, you’ll get a list of URLs that link to each such page, so you’ll have an easier time fixing the broken links. The regular crawl does not show any 3xx status codes. For more detailed information about which URLs your pages can be reached from, try the following command:

pylinkvalidate.py --report-type=all http://www.example.com/

This gives information about the status code of a page, plus all the other pages that link to it.
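Conceptually, that report maps each failing URL to its referrers. A small sketch of the idea (the data structures are my own, not pylinkvalidator's):

```python
from collections import defaultdict

def broken_link_report(edges, statuses):
    """edges: (source_page, linked_url) pairs found while crawling;
    statuses: linked_url -> HTTP status code. Returns each 4xx/5xx URL
    mapped to the pages that link to it, so fixes are easy to locate."""
    report = defaultdict(list)
    for source, target in edges:
        if statuses.get(target, 200) >= 400:
            report[target].append(source)
    return dict(report)

edges = [("/home", "/old-page"), ("/blog", "/old-page"), ("/home", "/about")]
statuses = {"/old-page": 404, "/about": 200}
print(broken_link_report(edges, statuses))
# {'/old-page': ['/home', '/blog']}
```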

Pylinkvalidator is an incredibly useful SEO tool for crawling your website for broken links (404) and server errors. Both of these errors can be bad for your SEO efforts, so be sure to crawl your own website regularly in order to fix them ASAP.

Conclusion

While these scripts are incredibly useful, there are many other uses for Python in the world of SEO. Challenge yourself to create scripts that make your SEO efforts more efficient. There are plenty of Python scripts to make your life easier: scripts for checking your hreflang tags, canonicals, robots.txt and much more. Because who, in this day and age, still does manually what can be automated?

Let us know your thoughts in the comment section below.
