Use Prerender to improve AngularJS SEO
NuGet package of the ASP.NET MVC HttpModule for prerender.io: Install-Package DotNetOpen.PrerenderModule
Source code of the ASP.NET MVC and ASP.NET Core middlewares for prerender.io: https://github.com/dingyuliang/prerender-dotnet (the ASP.NET Core middleware is still in progress and not checked in yet).
Many good JavaScript frameworks (e.g. AngularJS, BackboneJS, ReactJS) have been released in recent years, and they are becoming more and more popular. Many companies and developers use them to build applications. These frameworks offer several advantages:
- Separate frontend development and backend development.
- A JavaScript framework plus a RESTful API (or a microservice architecture) is flexible and easy to maintain; we can use the same set of APIs for ourselves and our clients.
- The backend is very lightweight compared with a traditional MVC framework, e.g. ASP.NET MVC, Spring MVC, ...
- Help to improve development productivity.
The challenge of relying heavily on a JavaScript framework, especially on user-facing pages (as opposed to administration pages), is that the markup uses virtual elements or attributes and JavaScript bindings to JSON objects, which is not SEO friendly. Many search engine and social media crawlers do not even support crawling or indexing JavaScript pages.
The good news is that we can use Prerender.io to prerender the page (that is, execute the JavaScript on the page) before it is served to search engine crawlers.
What is Prerender
Prerender.io is built on Node.js. It allows your JavaScript apps to be crawled by search engines and social media, and it is compatible with all JavaScript frameworks and libraries. It uses PhantomJS to render a JavaScript-rendered page as HTML. We can also cache rendered pages at the prerender service layer, which greatly improves performance.
PhantomJS is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG.
There are many prerender middlewares in different languages (a middleware is a library we can use to implement the prerender logic inside an application).
The full list of prerender middlewares is at https://prerender.io/documentation/install-middleware. The Apache and Nginx ones are server container level middlewares; the others are application level middlewares.
The official website is https://prerender.io/ and the GitHub URL is https://github.com/prerender
What are Prerender solutions
Generally, there are 3 different ways to integrate prerender.io:
Option 1: Application Level
Implement the prerender.io logic at the application level by using middlewares (e.g. NodeJS Express middleware, Ruby on Rails middleware, ASP.NET MVC middleware, ...).
- A request comes in.
- The application checks whether the request comes from a crawler (based on the user agent).
- If the request comes from a crawler, the application calls the prerender service with the original URL as a query string.
- The prerender service calls the application.
- The application returns the original HTML, with its JavaScript logic, to the prerender service.
- The prerender service executes the JavaScript inside the HTML, much as a browser would.
- The prerender service returns the final HTML to the application.
- The application returns the final HTML to the crawler.
- If the request comes from a regular user, the application renders its normal output and sends it back to the browser.
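The crawler check and the call to the prerender service in the steps above can be sketched roughly as follows. This is only a minimal illustration, not the code of any particular middleware: the crawler token list, the service address http://localhost:3000, the /render?url= request shape, and the function names are all assumptions made for the example.

```python
from urllib.parse import quote

# Hypothetical, incomplete list of crawler user-agent substrings.
CRAWLER_TOKENS = (
    "googlebot",
    "bingbot",
    "baiduspider",
    "twitterbot",
    "facebookexternalhit",
)

# Assumed address of a self-hosted prerender service; adjust for your setup.
PRERENDER_SERVICE = "http://localhost:3000"


def is_crawler(user_agent: str) -> bool:
    """Return True if the user agent contains one of the known crawler tokens."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def prerender_url(original_url: str) -> str:
    """Build the prerender request, passing the original URL as a query string.

    The "/render?url=" shape is an assumption for this sketch.
    """
    return f"{PRERENDER_SERVICE}/render?url={quote(original_url, safe='')}"


def route(user_agent: str, original_url: str) -> tuple:
    """Decide where a request goes: the prerender service or the application."""
    if is_crawler(user_agent):
        return ("prerender", prerender_url(original_url))
    return ("application", original_url)
```

The same decision applies to Options 2 and 3 below; only the place where it runs changes (application, server container, or proxy).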
Option 2: Server Container Level
Implement the prerender.io logic at the server container level (e.g. Apache, Nginx, IIS) by using URL rewrite middlewares.
- A request comes in.
- The server container (e.g. Apache, Nginx, IIS) checks whether the request comes from a crawler (based on the user agent).
- If the request comes from a crawler, the server container rewrites the URL (with the original URL as a query string) to the prerender service.
- The prerender service calls the application.
- The application returns the original HTML, with its JavaScript logic.
- The prerender service executes the JavaScript inside the HTML, much as a browser would.
- The prerender service returns the final HTML to the server container (e.g. Apache, Nginx, IIS).
- If the request comes from a regular user, the server container passes it to the application as normal. The application executes and returns its output to the server container.
Option 3: Network Level
Implement the prerender.io logic at the network level by using a load-balancing proxy, e.g. HAProxy:
- A request comes in.
- The load-balancing proxy checks whether the request comes from a crawler (based on the user agent).
- If the request comes from a crawler, the proxy redirects the traffic (with the original URL as a query string) to the prerender service.
- The prerender service calls the application.
- The application returns the original HTML, with its JavaScript logic.
- The prerender service executes the JavaScript inside the HTML, much as a browser would.
- The prerender service returns the final HTML to the load-balancing proxy.
- If the request comes from a regular user, the proxy forwards the traffic to the application. The application executes and returns its output to the load-balancing proxy.
Solution comparison
The 3 solutions above solve the same problem at different levels, but they yield different performance results.
- Option 1: Application Level
This solution is easy to implement and easy to debug, but it also makes the application heavy: the application has to wait while the prerender service calls back into the application and executes the JavaScript, which can take a long time depending on how complicated the JavaScript logic is. So the bottleneck of this solution is the application. Requests queue up at the application level, the server container level, and the network level.
If the prerender service is down, regular users are affected too (long-running requests to the prerender service consume resources on both the application and the server container).
- Option 2: Server Container Level
This solution uses URL rewrite logic to move the bottleneck from the application level to the server container level (e.g. IIS). At least the application itself becomes easier to extend and stays flexible. Requests queue up at the server container level and the network level.
If the prerender service is down, regular users are affected here as well (long-running requests to the prerender service consume resources on the server container).
- Option 3: Network Level
This solution is implemented at the highest level, the network level, by using a load balancer, so there is no bottleneck on the server container or the application; it moves to the load balancer.
With this solution, even if the prerender service is down, regular users are not affected.
Based on this basic analysis, generally speaking, Option 3 is better than Option 2, and Option 2 is better than Option 1.
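For Options 1 and 2, one common way to soften the outage problem described above is to call the prerender service with a short timeout and fall back to the unrendered page when the call fails. The sketch below is only an illustration under that assumption; the function name, the 5 second default, and the injectable opener parameter (used here to make the function easy to try without a network) are hypothetical and not part of any prerender middleware.

```python
import urllib.error
import urllib.request


def fetch_prerendered(prerender_request_url, fallback_html,
                      timeout_seconds=5.0, opener=urllib.request.urlopen):
    """Fetch prerendered HTML, or return fallback_html if the service fails.

    A short timeout keeps a slow or dead prerender service from holding
    application or server container resources for long.
    """
    try:
        with opener(prerender_request_url, timeout=timeout_seconds) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        # urllib.error.URLError and socket timeouts are both OSError subclasses.
        return fallback_html
```

With a fallback like this, a crawler gets the original, unrendered HTML during an outage instead of a hung request, which is worse for SEO but protects the rest of the site.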
Performance Concerns
Whichever solution we use, we should still think about how to improve performance, because executing JavaScript in the prerender service takes longer than server-side rendering. On the other hand, since only crawler traffic is sent to the prerender service, crawlers do not need exactly up-to-date information. I think we should use a cache in the prerender service to improve performance; we could even cache pages for 1 day. :)
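To make the caching idea concrete, here is a minimal in-memory sketch of a time-based cache placed in front of the render step. It is not how the prerender project implements caching; the class and function names are made up for the example, and a real deployment would more likely use a shared store so that the cache survives restarts.

```python
import time
from typing import Callable, Dict, Optional, Tuple

ONE_DAY_SECONDS = 24 * 60 * 60


class TTLCache:
    """Keep rendered HTML per URL and drop entries older than ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> Optional[str]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, html = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[url]  # expired: forget it and report a miss
            return None
        return html

    def put(self, url: str, html: str) -> None:
        self._entries[url] = (self._clock(), html)


def render_with_cache(url: str, cache: TTLCache,
                      render: Callable[[str], str]) -> str:
    """Return cached HTML when it is fresh, otherwise render and store it."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    html = render(url)
    cache.put(url, html)
    return html
```

Creating the cache with TTLCache(ONE_DAY_SECONDS) gives the 1 day lifetime mentioned above: the expensive render runs at most once per URL per day, and every other crawler hit is served from memory.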
In the next post, I will explain how to implement the prerender service by using the open source project: https://github.com/prerender