It sounds like a silly question, but I actually could not find a reference for it. I have created a sitemap_index with about 100 sitemaps. No errors.
I submitted only the sitemap_index to Google Webmaster Tools.
It displays only one sitemap as indexed (1 URL), although that sitemap should contain 50K links. A site: test shows in fact only ten pages, which were crawled not from the sitemap, with no links coming from the sitemaps.
I uploaded the sitemap_index and the sitemaps more than a week ago (I re-uploaded everything today as a test - the dates refer to today).
I read through a similar question: Too much delay for indexing Sitemap in Webmaster Tools?
All URLs are live; none are redirected.
URLs in the sitemaps are like:
The type of content that should get indexed contains names of topics. The site is about data visualisation: data is manipulated via AJAX, but the HTML page contains meta tags populated server-side. One URL was successfully indexed as a test via a single submission, not from the sitemaps.
I just cannot tell whether my sitemaps are actually getting crawled and eventually indexed or not.
If not, I cannot understand why, nor can I explain the discrepancy of having at least 1 sitemap in the sitemap_index reported as indexed while a site:example.com test shows no results from it.
I would be happy to share the site name in private, in case you would like to have a closer look at how the pages are created - I don't want to post it publicly so it doesn't come across as advertising!
My robots.txt includes only the sitemap_index (here sitemap.xml):

User-agent: *
Allow: /
Sitemap: http://example.com/sitemap.xml
Should I also list all the sitemaps mentioned in the sitemap_index in robots.txt?
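For reference, the sitemap_index follows the standard sitemapindex format from sitemaps.org, roughly like this (the URLs here are placeholders, not my real ones):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://example.com/sitemap-1.xml</loc>
  </sitemap>
  <!-- ...about 100 of these, each pointing to a sitemap of up to 50K URLs -->
</sitemapindex>
```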
Your sitemaps are getting crawled. To say they are being "indexed" would be a misnomer: a sitemap simply gives Google a list of URLs to crawl and index. The sitemaps themselves are crawled by Google to extract the URLs they contain, but the sitemap files don't get indexed.
The screenshot you are showing indicates that all of the sitemaps in the sitemap index have been detected by Google and all of the URLs have been added to the queue to be crawled. This is where the "web pages" box comes into play: it indicates how many URLs have been pulled out of the sitemap files and submitted to the crawl queue, and how many of those have actually been indexed so far.
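As a rough sketch of what happens when the index file is fetched (assuming the standard sitemaps.org namespace; the XML below is a made-up stand-in for your real files):

```python
# Sketch: how a crawler extracts child sitemap URLs from a sitemap index.
# The inline XML is a hypothetical example, not the poster's real file.
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

sitemap_index = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://example.com/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>http://example.com/sitemap-2.xml</loc></sitemap>
</sitemapindex>"""

root = ET.fromstring(sitemap_index)
# Collect every <loc> entry: these are the child sitemaps the crawler
# will fetch next to get the actual page URLs to queue for crawling.
child_sitemaps = [loc.text for loc in root.iter(SITEMAP_NS + "loc")]
print(child_sitemaps)
```

Each child sitemap is then fetched in turn and its page URLs are queued for crawling; the index file itself never appears in search results.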
From what you have provided in your question and comments, everything appears to be going fine. The fact that you have over 5 million URLs inevitably means it will take some time for Google to finish crawling and indexing them all, so the best thing is to hang tight and keep an eye on it. As each page is crawled and indexed, you will see that indexed count (currently 1) go higher and higher.