Blackout filtering
==================
A sitemap has an option to filter out objects that should not be
present in it. This option is accessible in the sitemap edit form
and is presented as the "Blackout entries" lines field.
In earlier versions of the package (<3.0.7 and <4.0.1) the field
filtered out objects only by their ids, and looked like this:
index.html
index_html
So all objects with the "index.html" or "index_html" ids were
excluded from the sitemap.
In the new versions of GoogleSitemaps, filtering was reworked into
a pluggable architecture. Filters are now named multi-adapters.
By default only the two most useful filters are provided: "id" and
"path".
Because different filters can be used, a new syntax applies to the
"Blackout entries" field. Every record in the field should follow
the spec:
[<filter name>:]<filter arguments>
If no <filter name> is specified, the "id" filter is used by
default. If <filter name> is specified, the system looks up a
multi-adapter with that name for the IBlackoutFilter interface.
If no such multi-adapter is found, the record is silently ignored.
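The record syntax described above can be illustrated with a minimal
sketch in plain Python. The helper name parse_blackout_record is
hypothetical and not part of the package; the sketch only assumes that
the first colon separates the filter name from its arguments.

```python
def parse_blackout_record(record, default_filter="id"):
    """Split one "Blackout entries" line into (filter name, arguments).

    Hypothetical helper for illustration only; the package's own
    parsing code may differ.
    """
    record = record.strip()
    name, sep, args = record.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so fall back to the default filter.
        return default_filter, record
    return name, args


print(parse_blackout_record("index.html"))
print(parse_blackout_record("path:/front-page"))
```

Under this assumption an old-style record such as "index.html" maps to
the "id" filter, while a prefixed record keeps its own filter name.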
Setup
=====
First, we must perform some setup. We use the testbrowser that is shipped
with Five, as this provides proper Zope 2 integration. Most of the
documentation, though, is in the underlying zope.testbrowser package.
>>> from Products.Five.testbrowser import Browser
>>> browser = Browser()
>>> portal_url = self.portal.absolute_url()
The following is useful when writing and debugging testbrowser tests. It lets
us see all error messages in the error_log.
>>> self.portal.error_log._ignored_exceptions = ()
With that in place, we can go to the portal front page and log in. We will
do this using the default user from PloneTestCase:
>>> from Products.PloneTestCase.setup import portal_owner, default_password
>>> browser.open(portal_url)
We have the login portlet, so let's use that.
>>> browser.open('http://nohost/plone/login_form')
>>> browser.getLink('Log in').click()
>>> browser.url
'http://nohost/plone/login_form'
>>> browser.getControl('Login Name').value = portal_owner
>>> browser.getControl('Password').value = default_password
>>> browser.getControl('Log in').click()
>>> "You are now logged in" in browser.contents
True
>>> "Login failed" in browser.contents
False
>>> browser.url
'http://nohost/plone/login_form'
Functionality
=============
First, create several documents for the demonstration.
In the root of the portal:
>>> self.addDocument(self.portal, "doc1", "Document 1 text")
>>> self.addDocument(self.portal, "doc2", "Document 2 text")
And in the member's folder:
>>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
>>> self.addDocument(self.folder, "doc2", "Member Document 2 text")
We also need to add a sitemap for the demonstration.
>>> browser.open(portal_url + "/prefs_gsm_settings")
>>> browser.getControl('Add Content Sitemap').click()
Now we are brought to the edit form of the newly created content
sitemap. We are interested in two things: the "Blackout entries" field
must be present in the form, and it should be empty by default.
>>> file("/tmp/browser.0.html","wb").write(browser.contents)
>>> blackout_list = browser.getControl("Blackout entries")
>>> blackout_list
<...>
>>> blackout_list.value == ""
True
>>> save_button = browser.getControl("Save")
>>> save_button
<...>
>>> save_button.click()
Clicking the "Save" button leads us to the resulting sitemap view.
>>> print browser.contents
<...>
>>> browser.open(portal_url + "/prefs_gsm_settings")
>>> smedit_link = browser.getLink('sitemap.xml')
>>> smedit_url = smedit_link.url
This link leads to the edit form of the newly created sitemap.xml.
Also prepare the view link to simplify the following demonstrations.
>>> smedit_url.endswith("sitemap.xml/edit")
True
>>> smview_url = smedit_url[:-5]
No filters
==========
The resulting sitemap has no filters, so all documents should be present in it.
>>> browser.open(smview_url)
>>> file("/tmp/browser.1.html","wb").write(browser.contents)
>>> no_filters_content = browser.contents
Check that the resulting page is a real sitemap.
>>> print browser.contents
<...>
>>> import re
>>> reloc = re.compile("%s([^\<]*)" % self.portal.absolute_url(), re.S)
With the help of the reloc regular expression, check that all 4 documents
plus the default front-page are present in the sitemap without filters.
>>> no_filters_res = reloc.findall(no_filters_content)
>>> no_filters_res.sort()
>>> print "\n".join(no_filters_res)
/Members/test_user_1_/doc1
/Members/test_user_1_/doc2
/doc1
/doc2
/front-page
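The extraction step above can be tried outside Plone with a short
sketch. The sample XML and the portal URL value are assumptions made up
for illustration; only the shape of the regular expression follows the
test above. The URL is escaped here so that its dots match literally.

```python
import re

# Assumed portal URL, matching the test browser URLs shown earlier.
portal_url = "http://nohost/plone"

# Made-up sitemap fragment for illustration; a real sitemap has more fields.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/doc2</loc></url>
  <url><loc>http://nohost/plone/doc1</loc></url>
  <url><loc>http://nohost/plone/front-page</loc></url>
</urlset>
"""

# Capture everything after the portal URL up to the next "<".
reloc = re.compile(r"%s([^<]*)" % re.escape(portal_url), re.S)
paths = sorted(reloc.findall(sample))
print("\n".join(paths))
```

Each captured group is the portal-relative path of one entry, which is
why the expected output above lists paths rather than full URLs.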
Check "id" filter
=================
Go to the edit form of the sitemap and add "doc1"
and "front-page" lines with "id:" prefix to the
"Blackout entries" field.
>>> browser.open(smedit_url)
>>> filtercontrol = browser.getControl("Blackout entries")
>>> filtercontrol.value = """
... id:doc1
... id:front-page
... """
>>> browser.getControl("Save").click()
>>> id_filter_content = browser.contents
As a result, all "doc1" and "front-page" documents must be
filtered out of the sitemap.
>>> id_filter_res = reloc.findall(id_filter_content)
>>> id_filter_res.sort()
>>> print "\n".join(id_filter_res)
/Members/test_user_1_/doc2
/doc2
Check "path" filter
===================
Suppose we want to filter out doc2 in the test_user_1_ folder (but
not the one in the portal root) and the front-page in the portal root.
>>> browser.open(smedit_url)
>>> filtercontrol = browser.getControl("Blackout entries")
>>> filtercontrol.value = """
... path:/Members/test_user_1_/doc2
... path:/front-page
... """
>>> browser.getControl("Save").click()
>>> path_filter_content = browser.contents
As a result, "doc2" of the given member and the "front-page" document
must be filtered out of the sitemap.
>>> path_filter_res = reloc.findall(path_filter_content)
>>> path_filter_res.sort()
>>> print "\n".join(path_filter_res)
/Members/test_user_1_/doc1
/doc1
/doc2
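The difference between the "id" and "path" filters can be sketched in
plain Python on the same list of paths. The function names are
hypothetical, and the path comparison is assumed to be an exact match;
the package's filters work on catalog results rather than strings.

```python
def drop_by_id(paths, blocked_id):
    # Compare only the last path segment, i.e. the object's id.
    return [p for p in paths if p.rsplit("/", 1)[-1] != blocked_id]


def drop_by_path(paths, blocked_path):
    # Compare the whole portal-relative path (exact match assumed).
    return [p for p in paths if p != blocked_path]


all_paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]

# "id" removes every doc2, wherever it lives.
by_id = drop_by_id(all_paths, "doc2")
# "path" removes only the doc2 in the member folder.
by_path = drop_by_path(all_paths, "/Members/test_user_1_/doc2")
print(by_id)
print(by_path)
```

The "path" filter is the one to use when two objects share an id but
only one of them should be hidden.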
Check default filter
====================
Let's check which filter is used for old-style entries
(without any filter name prefix).
Go to the edit form of the sitemap and add "doc1" and "front-page"
lines without any filter name prefix to the "Blackout entries"
field.
>>> browser.open(portal_url + "/sitemap.xml/edit")
>>> filtercontrol = browser.getControl("Blackout entries")
>>> filtercontrol.value = """
... doc1
... front-page
... """
>>> browser.getControl("Save").click()
>>> default_filter_content = browser.contents
By default the "id" filter must be used, so all "doc1" and "front-page"
objects must be filtered out of the sitemap.
>>> default_filter_res = reloc.findall(default_filter_content)
>>> default_filter_res.sort()
>>> print "\n".join(default_filter_res)
/Members/test_user_1_/doc2
/doc2
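How the whole field value might be processed, including the default
filter and silently ignored unknown names, can be sketched as follows.
The FILTERS dictionary is a hypothetical stand-in for the named
multi-adapter lookup, and apply_blackout is not a function of the
package.

```python
def _by_id(paths, arg):
    return [p for p in paths if p.rsplit("/", 1)[-1] != arg]


def _by_path(paths, arg):
    return [p for p in paths if p != arg]


# Hypothetical registry standing in for the named multi-adapter lookup.
FILTERS = {"id": _by_id, "path": _by_path}


def apply_blackout(paths, field_value, default="id"):
    """Apply every record of a "Blackout entries" value in turn."""
    for line in field_value.splitlines():
        line = line.strip()
        if not line:
            continue
        name, sep, arg = line.partition(":")
        if not sep:
            # Old-style record without a prefix: use the default filter.
            name, arg = default, line
        handler = FILTERS.get(name)
        if handler is None:
            # Unknown filter name: the record is ignored silently.
            continue
        paths = handler(paths, arg)
    return paths


all_paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]
print(apply_blackout(all_paths, "\ndoc1\nfront-page\n"))
```

With this sketch, prefixed and unprefixed records for the same ids give
the same result, which mirrors the two tests above.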
Creating your own filters
=========================
Suppose we want to create our own blackout filter, which behaves
like the id filter, but with some differences.
Our filter has the following format:
(+|-)<id>
- when the first sign is "+", only objects with <id> must
remain after filtering,
- when the first sign is "-", all objects with <id> must be
filtered out (like the default id filter).
You need to create a new IBlackoutFilter multi-adapter
and register it under a unique name.
>>> from zope.component import adapts
>>> from zope.interface import Interface, implements
>>> from zope.publisher.interfaces.browser import IBrowserRequest
>>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
>>> class SignedIdFilter(object):
...     adapts(Interface, IBrowserRequest)
...     implements(IBlackoutFilter)
...
...     def __init__(self, context, request):
...         self.context = context
...         self.request = request
...
...     def filterOut(self, fdata, fargs):
...         # fargs is the text after the "signedid:" prefix, e.g. "+doc1"
...         sign = fargs[0]
...         fid = fargs[1:]
...         if sign == "+":
...             # keep only the items with the given id
...             return [b for b in fdata if b.getId == fid]
...         elif sign == "-":
...             # drop the items with the given id
...             return [b for b in fdata if b.getId != fid]
...         # any other sign: leave the data unchanged
...         return fdata
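The filtering logic of the class above can be tried outside Plone with
a small stand-alone sketch. The Brain tuple is a hypothetical stand-in
for the items in fdata; it assumes they expose getId as a plain
attribute, as the comparisons in filterOut suggest.

```python
from collections import namedtuple

# Hypothetical stand-in for the filtered items; only getId and a path.
Brain = namedtuple("Brain", ["getId", "path"])


def signed_id_filter(brains, fargs):
    # Same decision logic as SignedIdFilter.filterOut above.
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        return [b for b in brains if b.getId == fid]
    if sign == "-":
        return [b for b in brains if b.getId != fid]
    return brains


brains = [
    Brain("doc1", "/Members/test_user_1_/doc1"),
    Brain("doc2", "/Members/test_user_1_/doc2"),
    Brain("doc1", "/doc1"),
    Brain("doc2", "/doc2"),
    Brain("front-page", "/front-page"),
]

kept = sorted(b.path for b in signed_id_filter(brains, "+doc1"))
dropped = sorted(b.path for b in signed_id_filter(brains, "-doc1"))
print(kept)
print(dropped)
```

The sketch slices with fargs[:1] instead of fargs[0] so that an empty
argument string returns the data unchanged instead of raising an error.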
Now register this new filter as a named multi-adapter.
>>> from zope.component import provideAdapter
>>> provideAdapter(SignedIdFilter,
... name=u'signedid')
That is all that is needed to add a new filter.
Now test that the newly added filter is really taken into consideration.
Check that white filtering ("+" prefix) works correctly.
Go to the edit form of the sitemap and add "signedid:+doc1"
to the "Blackout entries" field.
>>> browser.open(smedit_url)
>>> filtercontrol = browser.getControl("Blackout entries")
>>> filtercontrol.value = """
... signedid:+doc1
... """
>>> browser.getControl("Save").click()
>>> signedid_filter_content = browser.contents
As a result, only objects with the "doc1" id must be present in the sitemap.
>>> signedid_filter_res = reloc.findall(signedid_filter_content)
>>> signedid_filter_res.sort()
>>> print "\n".join(signedid_filter_res)
/Members/test_user_1_/doc1
/doc1
And finally, check that black filtering ("-" prefix) works.
Go to the edit form of the sitemap and add "signedid:-doc1"
to the "Blackout entries" field.
>>> browser.open(smedit_url)
>>> filtercontrol = browser.getControl("Blackout entries")
>>> filtercontrol.value = """
... signedid:-doc1
... """
>>> browser.getControl("Save").click()
>>> signedid_filter_content = browser.contents
As a result, all objects except those with the "doc1" id must be present in the sitemap.
>>> signedid_filter_res = reloc.findall(signedid_filter_content)
>>> signedid_filter_res.sort()
>>> print "\n".join(signedid_filter_res)
/Members/test_user_1_/doc2
/doc2
/front-page