Google bot crawling on AngularJS site with HTML5 Mode routes


We have an AngularJS site using HTML5 routes. We did some test "Fetch as Google" runs. The results are a bit confusing.

However, we prepared for Google not being able to crawl our site: we have added the fragment meta tag so that Googlebot revisits our page with "?_escaped_fragment_=". We followed this guide, https://developers.google.com/webmasters/ajax-crawling/docs/getting-started (section "3. Handle pages without hash fragments"). In our nginx config we have this:

if ($args ~ "_escaped_fragment_=") {
    # serve static HTML snapshots
}

And indeed, this works fine if we pass _escaped_fragment_= ourselves. However, Googlebot never tried to crawl our site with this param, so it never crawled the snapshot. Are we missing something? Should we add user-agent detection for Googlebot to our nginx conf? Something like this?

if ($http_user_agent ~* "googlebot|yahoo|bingbot|baiduspider|yandex|yeti|yodaobot|gigabot|ia_archiver|facebookexternalhit|twitterbot|developers\.google\.com") {
    # serve snapshots
}

It would be great if we could understand this better. Thank you in advance!

Update:
I read this, http://scotch.io/tutorials/javascript/angularjs-seo-with-prerender-io?_escaped_fragment_=tag#caveats. So it seems that when using manual tools (Fetch as Google), we should pass either #! or ?_escaped_fragment_= ourselves in the right place. And indeed, if we pass ?_escaped_fragment_= in our case, we see the HTML snapshot that we have created.

Is this true? Is this how it works indeed?

Update 2: At the bottom of this thread, a Google employee verifies that with Google Webmasters "Fetch as Google" you need to manually pass the _escaped_fragment_= param yourself: https://productforums.google.com/forum/#!msg/webmasters/fzjdyjq0n98/pz-nlq_2rjcj

Cheers,
Iraklis

I will try to answer your questions based on our experiences over the last month of developing an SPA in HTML5 mode.

How do you get Googlebot to use ?_escaped_fragment_= instead of direct links?

This is quite simple but easy to overlook. In fact, there are 2 different ways to get Googlebot to try the escaped fragment. The first method is to run your site in non-HTML5 mode. This means your URLs are of the form:

http://my.domain.com/base/#!some/path/on/website

Googlebot recognizes the #! and makes a second call to the server with an altered URL:

http://my.domain.com/base/?_escaped_fragment_=some/path/on/website

which you can then handle as you wish. The second way to get Googlebot to try the _escaped_fragment_ mode is to include the following meta tag on the index page you supply to the bot:

<meta name="fragment" content="!"> 

This will make Googlebot check for the other version of the webpage every time it sees the tag. Interestingly, you can use both of these techniques or, as we ended up doing, run in HTML5 mode with the meta tag. This means your URLs are escaped as follows:

http://my.domain.com/base/some/path/on/website?_escaped_fragment_=

Interestingly, the bot does not put anything at the end of the fragment. Depending on the web server you are running, you can map a pattern matching the "_escaped_fragment_" text to an alternate bot page. For more information on the escaped fragment, see Google's AJAX crawling documentation.
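To make the two URL rewrites concrete, here is a minimal Python sketch of what the crawler appears to do in each mode, plus the kind of check a server could use to decide when to return a snapshot. The function names `to_escaped_fragment_url` and `is_snapshot_request` are made up for illustration, and the percent-escaping is only an approximation of the rules in the AJAX crawling scheme, so treat this as a way to generate test URLs rather than a reference implementation.

```python
from urllib.parse import parse_qs, quote, urlsplit, urlunsplit


def to_escaped_fragment_url(url):
    """Return the URL a crawler following the AJAX crawling scheme would request.

    Hypothetical helper: covers the hashbang case and the HTML5-mode case
    (which assumes the page carries the fragment meta tag).
    """
    parts = urlsplit(url)
    if parts.fragment.startswith("!"):
        # Hashbang mode: everything after "#!" moves into the query string.
        # Characters such as "&", "#", "+", "%" and spaces get percent-escaped.
        value = quote(parts.fragment[1:], safe="/=:,;@!$'()*")
    elif not parts.fragment:
        # HTML5 mode with the meta tag: an empty parameter is appended.
        value = ""
    else:
        # An ordinary "#anchor" is not part of the scheme; leave it alone.
        return url
    extra = "_escaped_fragment_=" + value
    query = parts.query + "&" + extra if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def is_snapshot_request(url):
    """True if the request carries the _escaped_fragment_ parameter, even when empty."""
    query = urlsplit(url).query
    return "_escaped_fragment_" in parse_qs(query, keep_blank_values=True)


if __name__ == "__main__":
    for page in (
        "http://my.domain.com/base/#!some/path/on/website",
        "http://my.domain.com/base/some/path/on/website",
    ):
        print(to_escaped_fragment_url(page))
```

Pasting the generated URL into "Fetch as Google" is one way to check by hand that the server really returns the snapshot for that route.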

"Fetch as Googlebot" returns 2 different versions of the page: the source with {{}} and the rendered page looking correct. What does this mean?

Google's bots have been able to interpret JavaScript to a limited extent since 2014. For more information, read the official blog entry on the Google Webmasters blog. However, as is made clear in that blog entry, this comes with a lot of caveats. For instance:

  1. Googlebot does not guarantee that it will execute your JavaScript code.
  2. Googlebot will attempt to find links in your JavaScript, follow them, and use them to find more pages.
  3. Googlebot will render a preview in Webmaster Tools by executing as much of the JavaScript as it can (thus the lack of {{}} in the rendered version).
  4. Googlebot will not use the rendered version in order to build the meta information for its site index.

As of 18/12/2014, we are still unsure whether Googlebot can extract information from an SPA in rendered mode for its index, beyond finding links to follow in the JavaScript. In our experience, Googlebot will include the {{}} in the index listing, so when you try to use {{}} to fill in meta information (description, keywords, title, etc.), your site looks like this in Google search results:

{{meta.sitetitle}}
http://my.domain.com/base/some/path/on/website
{{meta.description}}

rather than what you would expect, which might be something like this:

domain
http://my.domain.com/base/some/path/on/website
A random page on my domain. An excellent example page for sure!
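One way to catch this before it reaches the index is to scan the HTML you serve to bots for leftover {{}} bindings in the title and meta tags. The sketch below uses only the Python standard library; the name `find_unrendered_placeholders` is hypothetical, and it assumes the default AngularJS {{ }} interpolation symbols.

```python
import re
from html.parser import HTMLParser

# Matches the default AngularJS interpolation markers, e.g. {{meta.description}}
PLACEHOLDER = re.compile(r"\{\{.*?\}\}")


class MetaPlaceholderFinder(HTMLParser):
    """Collects {{...}} bindings left in <title> text and <meta content="...">."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.leftovers = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            # The content attribute may be missing or have no value.
            content = dict(attrs).get("content") or ""
            self.leftovers.extend(PLACEHOLDER.findall(content))

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.leftovers.extend(PLACEHOLDER.findall(data))


def find_unrendered_placeholders(html):
    """Return every {{...}} binding found in the title or meta tags of html."""
    finder = MetaPlaceholderFinder()
    finder.feed(html)
    finder.close()
    return finder.leftovers
```

Running this against a snapshot should return an empty list; running it against the raw index page of the app shows which bindings a crawler would see if it indexed the unrendered source.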

