Screenshots of Websites

Updated a month ago ​by Blitline Support

BLITLINE SCREENSHOTS

Blitline can take a screenshot of your website.

Give Blitline a URL and we will render your page as an image. You can then run any additional Blitline operations on it.

The JSON for screenshots works like a regular Blitline job, except you need to change two fields:

  • The ‘src’ field for the job becomes the url for the screenshot you want to take
  • You must add an additional ‘src_type’ : ‘screen_shot_url’ to the base job.

For example, you could take a screenshot by simply submitting the following JSON:

   {  "application_id": "YOUR_APP_ID",
      "src" : "https://metrics.librato.com/",
      "src_type" : "screen_shot_url",
      "functions" :
      [{
         "name": "resize_to_fit",
         "params" : { "width" : "400" },
         "save" : { "image_identifier" : "example"}
      }]
   }
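
The same job can also be assembled programmatically. Below is a minimal Python sketch that assumes only the fields shown in the JSON above. The helper name `build_screenshot_job` is hypothetical and not part of any Blitline client library, and the sketch only builds and prints the JSON; it does not submit the job.

```python
import json


def build_screenshot_job(application_id, page_url, width):
    """Build a screenshot job dict mirroring the JSON example above.

    Hypothetical helper for illustration; not part of a Blitline library.
    """
    return {
        "application_id": application_id,
        # For screenshots, "src" is the page to capture, not an image URL.
        "src": page_url,
        # Marks "src" as a web page to be rendered as a screenshot.
        "src_type": "screen_shot_url",
        "functions": [
            {
                "name": "resize_to_fit",
                # The example above passes the width as a string.
                "params": {"width": str(width)},
                "save": {"image_identifier": "example"},
            }
        ],
    }


job = build_screenshot_job("YOUR_APP_ID", "https://metrics.librato.com/", 400)
print(json.dumps(job, indent=2))
```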

ARE THERE OTHER OPTIONS FOR SCREENSHOTS?

YES. You can set these options using a “src_data” element that is a sibling to the “src_type”.

src_data (optional: 1 time)

  • viewport - Sometimes the display of a website is dynamic and depends on the size of the browser window, so we provide a way for you to specify the “viewport”, or window size, of the browser. This value follows the format “WIDTHxHEIGHT”. (optional: 1 time)

  • delay - Some web pages render content with JavaScript after the page has loaded. By default we wait 5 seconds for a page to finish rendering before taking the screenshot. If you want this to be more or less, you can set the delay in milliseconds before rendering. (default 5000, max 30000) (optional: 1 time)

  • save_html - Some web pages render content with JavaScript after the page has loaded. If you want to capture the HTML of the rendered page that we took a screenshot of, add the save_html option with a destination (s3/azure). (optional: 1 time)

    • s3_destination (or azure_destination) - An optional container, nested inside save_html, that identifies where you would like us to push the HTML. This requires an Amazon S3 account that you have given Blitline permission to write to. (optional: 1 time)

      • bucket - Your S3 bucket to push to. (required: 1 time)

      • key - The S3 key that you wish Blitline to write the HTML to. You will need to name it, and the key should include the .html filename. (required: 1 time)

      • headers - Optional array of headers to set on the object we push to S3. (optional: 1 time)

So, to see this in action, we can do something like this:

   {  "application_id": "YOUR_APP_ID",
      "src" : "https://metrics.librato.com/",
      "src_type" : "screen_shot_url",
      "src_data" : {
           "viewport" : "1200x800",
           "delay" : 5000
      },
      "functions" :
      [{
         "name": "resize_to_fit",
         "params" : { "width" : "400" },
         "save" : { "image_identifier" : "example"}
      }]
   }
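
The documented limits on these options can be checked on the client before a job is submitted. The following Python sketch is an illustration only: `build_src_data` is a hypothetical helper, the lowercase "x" separator and the lower bound of 0 for the delay are assumptions, and only the 30000 ms maximum comes from the option list above.

```python
import re

MAX_DELAY_MS = 30000  # maximum delay stated in the option list above


def build_src_data(viewport=None, delay_ms=None):
    """Return a "src_data" dict after basic sanity checks.

    Hypothetical helper; Blitline itself may accept or reject other values.
    """
    src_data = {}
    if viewport is not None:
        # Assumed shape: positive integers joined by a lowercase "x".
        if not re.fullmatch(r"[1-9]\d*x[1-9]\d*", viewport):
            raise ValueError("viewport must look like WIDTHxHEIGHT, e.g. 1200x800")
        src_data["viewport"] = viewport
    if delay_ms is not None:
        # The lower bound of 0 is an assumption; the maximum is documented.
        if not 0 <= delay_ms <= MAX_DELAY_MS:
            raise ValueError("delay must be between 0 and %d ms" % MAX_DELAY_MS)
        src_data["delay"] = delay_ms
    return src_data


print(build_src_data(viewport="1200x800", delay_ms=5000))
```

Omitting an argument leaves that key out of the result, so Blitline's defaults apply.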

SEO


For modern dynamic websites, it is difficult for Google to properly index your page. Blitline has a screenshot functionality which allows us to render your webpage, not only as an image, but additionally we can output the final generated HTML. You can choose to save this HTML out, and then point Google to the pre-rendered pages, so that it can properly index your website.

NOTE: Webkit Only. Sorry, NO IE or Firefox :(

You can redirect search engines to your static HTML pages instead of the dynamic ones.

This example assumes your site is built following the conventions of Google’s recommendation for searchable ajax.

1 - GENERATE STATIC PAGES

The first thing you need to do is to generate your static pages. You can do this by following the instructions here or see the example below.

You will need to do this for every possible navigation option on your website (i.e. every #! URL).


Recommendation:

If your page is www.myexamplesite.com/#!seattl... we recommend you save it into an S3 bucket (or Azure container) with the key of seattle/hotels
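
This naming convention can be applied mechanically. The Python sketch below derives a storage key from a #! URL by taking everything after the "#!" marker. The function name `hashbang_to_key` is hypothetical, and stripping stray slashes is an assumption about how you might want to normalize keys.

```python
def hashbang_to_key(page_url):
    """Return the route after '#!' for use as an S3 key or Azure blob name."""
    marker = "#!"
    if marker not in page_url:
        raise ValueError("URL has no '#!' fragment: " + page_url)
    # Everything after the first '#!' is the route; drop stray slashes.
    return page_url.split(marker, 1)[1].strip("/")


print(hashbang_to_key("www.myexamplesite.com/#!seattle/hotels"))  # seattle/hotels
```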


Example:

{
  "application_id": "YOUR_APP_ID",
  "src" : "www.myexamplesite.com/#!seattle/hotels",
  "src_type" : "screen_shot_url",
  "src_data" : {
       "viewport" : "1200x800",
       "save_html" : {
           "s3_destination" : {
               "bucket" : "my_s3_bucket",
               "key" : "seattle/hotels"
           }
       }
  },
  "functions" :
  [{
     "name": "no_op"
  }]
}

NOTE: You must replace YOUR_APP_ID, my_s3_bucket, myexamplesite.com, and the example keys with your OWN related resources.
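
Because a job like this is needed for every #! URL, it may help to generate the jobs in a loop. The Python sketch below builds one job per route using the same fields as the example above. The `build_seo_job` helper and the second route in the list are hypothetical, and submitting the jobs is left out.

```python
import json


def build_seo_job(application_id, page_url, bucket):
    """Build a job that saves the rendered HTML of one #! page to S3.

    Hypothetical helper; fields mirror the JSON example above.
    """
    # Use the route after "#!" as the S3 key, as recommended above.
    key = page_url.split("#!", 1)[1].strip("/")
    return {
        "application_id": application_id,
        "src": page_url,
        "src_type": "screen_shot_url",
        "src_data": {
            "viewport": "1200x800",
            "save_html": {"s3_destination": {"bucket": bucket, "key": key}},
        },
        # Only the HTML is needed here, so no image processing is requested.
        "functions": [{"name": "no_op"}],
    }


# Hypothetical routes; replace with every #! route on your own site.
routes = ["seattle/hotels", "seattle/restaurants"]
jobs = [
    build_seo_job("YOUR_APP_ID", "www.myexamplesite.com/#!" + route, "my_s3_bucket")
    for route in routes
]
print(json.dumps(jobs, indent=2))
```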


2 - UPDATE YOUR REVERSE PROXY

Now that you have your static pages generated, you need to make sure the search engines look in the right spot for them.


If you followed Google’s recommendation for searchable ajax, search engines will convert the ‘#!’ in the URL into the parameter ‘_escaped_fragment_=’

So, now when a bot comes looking for a URL such as:

http://www.myexamplesite.com?_escaped_fragment_=seattle/hotels

your reverse proxy will redirect it to the cached static HTML version we stored on S3.
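
To make the mapping concrete, the Python sketch below converts a #! URL into the form a crawler would request, and then recovers the storage key from that request. Both function names are hypothetical, and the percent-escaping only approximates the rules in Google's scheme; treat it as an illustration, not a complete implementation.

```python
from urllib.parse import parse_qs, quote, urlsplit


def to_escaped_fragment_url(page_url):
    """Rewrite 'base#!route' as 'base?_escaped_fragment_=route'."""
    if "#!" not in page_url:
        raise ValueError("URL has no '#!' fragment: " + page_url)
    base, fragment = page_url.split("#!", 1)
    # Append to an existing query string if the base already has one.
    separator = "&" if "?" in base else "?"
    # Keeping "/" and "=" unescaped is a simplification of Google's rules.
    return base + separator + "_escaped_fragment_=" + quote(fragment, safe="/=")


def static_key_from_request(request_url):
    """Return the key of the pre-rendered page, or None for normal requests."""
    query = urlsplit(request_url).query
    values = parse_qs(query).get("_escaped_fragment_")
    return values[0] if values else None


bot_url = to_escaped_fragment_url("http://www.myexamplesite.com/#!seattle/hotels")
print(bot_url)
print(static_key_from_request(bot_url))
```

The second function mirrors what the reverse proxy has to do: pull the value of `_escaped_fragment_` out of the query string and use it as the path to the stored HTML.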


NGINX

To make Nginx redirect to your static pages, simply add the following snippet to your nginx.conf file within the “server” section.


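if ($args ~ "_escaped_fragment_=(.+)") {
  # Store the captured fragment (everything after the "=").
  set $real_url $1;
  # The trailing "?" stops nginx from appending the original query string.
  rewrite ^ http://my_s3_bucket/$real_url?;
}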

APACHE

To make Apache redirect to your static pages, change your .htaccess file to have the following.


# %1 is the fragment captured by RewriteCond ($1 would be the request path).
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^(.*)$  http://my_s3_bucket/%1 [P,QSA,L]

ALTERNATIVE SITE-MAP OPTION


You could also submit an XML document to Google Webmaster Tools with a list of your rendered pages, as outlined in this document http://ajax.rswebanalytics.com...


PAGES WITHOUT HASH FRAGMENTS

According to the Google spec, “In order to make pages without hash fragments crawlable, you include a special meta tag in the head of the HTML of your page. The meta tag takes the following form:”

<meta name="fragment" content="!">

Please read more about this Googlebot feature here: https://developers.google.com/...


COST?

Unless you are running SEO dumps every day, or you have a massive site, you probably won’t even need a paid account here at Blitline. We think many of you can probably get by on the FREE developer account. Try it… what have you got to lose?


Enjoy


