Screenshots of Websites

Updated 1 year ago by Blitline Support

BLITLINE SCREENSHOTS

Blitline can take a screenshot of your website.

Give Blitline a URL and we will render your page as an image. You can then run any additional Blitline operations on it.

The JSON for screenshots works like a regular Blitline job, except you need to change two fields:

  • The ‘src’ field for the job becomes the URL of the page you want to screenshot.
  • You must add an additional ‘src_type’ : ‘screen_shot_url’ field to the base job.

For example, you could take a screenshot of a page simply by submitting the following JSON:

             {  "application_id": "YOUR_APP_ID",
                "src" : "https://metrics.librato.com/",
                "src_type" : "screen_shot_url",
                "functions" :
                [{
                   "name": "resize_to_fit",
                   "params" : { "width" : "400" },
                   "save" : { "image_identifier" : "example"}
                }]
             }
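The same job can be assembled programmatically. The following is a minimal Python sketch that only builds and prints the JSON shown above; the helper name build_screenshot_job is hypothetical and not part of any Blitline library, and submitting the job is left out.

```python
import json


def build_screenshot_job(app_id, page_url, width):
    """Build a job dict shaped like the screenshot example above.

    Hypothetical helper for illustration; not part of a Blitline client.
    """
    return {
        "application_id": app_id,
        "src": page_url,                # the page to capture, not an image URL
        "src_type": "screen_shot_url",  # marks src as a page to screenshot
        "functions": [
            {
                "name": "resize_to_fit",
                "params": {"width": str(width)},  # the example passes width as a string
                "save": {"image_identifier": "example"},
            }
        ],
    }


job = build_screenshot_job("YOUR_APP_ID", "https://metrics.librato.com/", 400)
print(json.dumps(job, indent=2))
```

Building the job as a dict and serializing it with json.dumps avoids hand-written JSON mistakes such as missing commas or braces.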

Check out a live demo here

ARE THERE OTHER OPTIONS FOR SCREENSHOTS?

YES. You can set these options using a “src_data” element that is a sibling to the “src_type”.

src_data (optional: 1 time)

  • viewport - Sometimes the display of a website is dynamic and depends on the size of the browser window, so we provide a way for you to specify the “viewport”, or window size, of the browser. This value follows the format “WIDTHxHEIGHT”. (optional: 1 time)

  • delay - Some web pages render content with JavaScript after the page has loaded. By default we wait 5 seconds for a page to finish rendering before taking the screenshot. If you want a longer or shorter wait, set the delay in milliseconds. (default 5000, max 30000) (optional: 1 time)

  • save_html - If you want to capture the HTML of the rendered page that we took a screenshot of (including anything rendered by JavaScript after load), add the save_html option with a destination (S3 or Azure). (optional: 1 time)

  • s3_destination (or azure_destination) - An optional container, placed inside save_html, that identifies where you would like us to push the HTML (for S3, you need an Amazon S3 account which you have given Blitline permission to write to). (optional: 1 time)

    • bucket - Your S3 bucket to push to. (required: 1 time)

    • key - The S3 key for the HTML that you wish Blitline to write to (you will need to name it; this key should include the .html filename). (required: 1 time)

    • headers - Optional array of headers to set on the object we push to S3. (optional: 1 time)
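As a rough illustration of these options, the sketch below builds a src_data dict and checks the documented constraints (the “WIDTHxHEIGHT” viewport format and the 30000 ms delay ceiling) before a job is submitted. The function build_src_data is hypothetical, the lower bound of 0 for delay is an assumption, and the nesting of s3_destination inside save_html follows the SEO example on this page.

```python
import re

MAX_DELAY_MS = 30000  # documented maximum; the default of 5000 applies when delay is omitted
_VIEWPORT_RE = re.compile(r"\d+x\d+")  # "WIDTHxHEIGHT", e.g. "1200x800"


def build_src_data(viewport=None, delay=None, html_bucket=None, html_key=None):
    """Build a src_data dict, validating the documented constraints.

    Hypothetical helper for illustration; not part of a Blitline client.
    """
    src_data = {}
    if viewport is not None:
        if not _VIEWPORT_RE.fullmatch(viewport):
            raise ValueError("viewport must look like WIDTHxHEIGHT, e.g. 1200x800")
        src_data["viewport"] = viewport
    if delay is not None:
        # Assumption: delays below 0 are rejected; the docs only state the maximum.
        if not 0 <= delay <= MAX_DELAY_MS:
            raise ValueError("delay must be between 0 and 30000 milliseconds")
        src_data["delay"] = delay
    if html_bucket is not None or html_key is not None:
        if not (html_bucket and html_key):
            raise ValueError("save_html needs both a bucket and a key")
        src_data["save_html"] = {
            "s3_destination": {"bucket": html_bucket, "key": html_key}
        }
    return src_data


print(build_src_data(viewport="1200x800", delay=5000))
```

Validating locally is optional, but it catches a malformed viewport or an out-of-range delay before the job is sent.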

To see this in action, we can do something like this:

             {  "application_id": "YOUR_APP_ID",
                "src" : "https://metrics.librato.com/",
                "src_type" : "screen_shot_url",
                "src_data" : {
                     "viewport" : "1200x800",
                     "delay" : 5000
                },
                "functions" :
                [{
                   "name": "resize_to_fit",
                   "params" : { "width" : "400" },
                   "save" : { "image_identifier" : "example"}
                }]
             }

Check out a live demo here

SEO


Modern dynamic websites are difficult for Google to index properly. Blitline's screenshot functionality renders your webpage not only as an image, but can also output the final generated HTML. You can save this HTML and then point Google to the pre-rendered pages so that it can index your website properly.

NOTE: Webkit Only. Sorry, NO IE or Firefox :(

You can redirect search engines to your static HTML pages instead of the dynamic ones.

This example assumes your site is built following the conventions of Google’s recommendation for searchable ajax.

1 - GENERATE STATIC PAGES

The first thing you need to do is to generate your static pages. You can do this by following the instructions here or see the example below.

You will need to do this for every possible navigation option on your website (i.e., every #! URL).


Recommendation:

If your page is www.myexamplesite.com/#!seattl... we recommend you save it into an S3 bucket (or Azure container) with the key seattle/hotels


Example:

          {
            "application_id": "YOUR_APP_ID",
            "src" : "www.myexamplesite.com/#!seattle/hotels",
            "src_type" : "screen_shot_url",
            "src_data" : {
                 "viewport" : "1200x800",
                 "save_html" : {
                     "s3_destination" : {
                         "bucket" : "my_s3_bucket",
                         "key" : "seattle/hotels"
                     }
                 }
            },
            "functions" :
            [{
               "name": "no_op"
            }]
          }

NOTE: You must replace YOUR_APP_ID, my_s3_bucket, myexamplesite.com, and the example keys with your OWN related resources.
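Since a job is needed for every #! URL, generating the jobs from a list of URLs can save some typing. The sketch below derives the storage key from the part after “#!”, following the recommendation above, and builds one job per page. The helper names and the second URL are hypothetical, and submitting the jobs is not shown.

```python
def hashbang_key(url):
    """Return the part after '#!' for use as the storage key."""
    _, sep, fragment = url.partition("#!")
    if not sep or not fragment:
        raise ValueError(f"no '#!' fragment in {url!r}")
    return fragment


def build_static_page_job(app_id, url, bucket):
    """Build a save_html job shaped like the example above (hypothetical helper)."""
    return {
        "application_id": app_id,
        "src": url,
        "src_type": "screen_shot_url",
        "src_data": {
            "viewport": "1200x800",
            "save_html": {
                "s3_destination": {"bucket": bucket, "key": hashbang_key(url)}
            },
        },
        # no_op: no image processing is needed, only the rendered HTML
        "functions": [{"name": "no_op"}],
    }


# Illustrative URLs; replace with every #! URL on your own site.
urls = [
    "www.myexamplesite.com/#!seattle/hotels",
    "www.myexamplesite.com/#!seattle/restaurants",
]
jobs = [build_static_page_job("YOUR_APP_ID", u, "my_s3_bucket") for u in urls]
for job in jobs:
    print(job["src"], "->", job["src_data"]["save_html"]["s3_destination"]["key"])
```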


2 - UPDATE YOUR REVERSE PROXY

Now that you have your static pages generated, you need to make sure the search engines look in the right spot for them.


If you followed Google’s recommendation for searchable ajax, search engines will convert the ‘#!’ in the URL into the parameter ‘_escaped_fragment_=’

So, now when a bot comes looking for a URL like:

          http://www.myexamplesite.com?_escaped_fragment_=seattle/hotels

your reverse proxy will redirect it to the cached static HTML version we stored on S3.
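To make the mapping concrete, here is a small Python sketch of the lookup a reverse proxy performs: it pulls the _escaped_fragment_ value (the parameter name matched by the Nginx and Apache rules on this page) out of a crawler request URL so it can be used as the key of the stored HTML. The function name is hypothetical; only the standard library is used.

```python
from urllib.parse import parse_qs, urlsplit


def static_key_from_request(url):
    """Return the _escaped_fragment_ value of a request URL, or None if absent."""
    query = urlsplit(url).query
    values = parse_qs(query, keep_blank_values=True).get("_escaped_fragment_")
    return values[0] if values else None


bot_url = "http://www.myexamplesite.com?_escaped_fragment_=seattle/hotels"
print(static_key_from_request(bot_url))  # seattle/hotels
```

Note that parse_qs also decodes percent-escapes, so a crawler request carrying seattle%2Fhotels resolves to the same key.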


NGINX

To make Nginx redirect to your static pages, simply add the following snippet to your nginx.conf file within the “server” section.


          if ($args ~ "_escaped_fragment_=(.+)") {
            set $real_url $1;
            rewrite ^ http://my_s3_bucket/$real_url;
          }

APACHE

To make Apache redirect to your static pages, change your .htaccess file to have the following.


          # %1 refers to the fragment captured by the RewriteCond pattern
          RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
          RewriteRule ^(.*)$  http://my_s3_bucket/%1 [P,QSA,L]

ALTERNATIVE SITE-MAP OPTION


You could also submit an XML document to Google Webmaster Tools with a list of your rendered pages, as outlined in this document http://ajax.rswebanalytics.com...


PAGES WITHOUT HASH FRAGMENTS

According to the Google spec, “In order to make pages without hash fragments crawlable, you include a special meta tag in the head of the HTML of your page. The meta tag takes the following form:”

          <meta name="fragment" content="!">

Please read more about this Googlebot feature here: https://developers.google.com/...


COST?

Unless you are running SEO dumps every day, or you have a massive site, you probably won’t even need a paid account here at Blitline. We think many of you can probably get by on the FREE developer account. Try it… what have you got to lose?


Enjoy


