source: products/quintagroup.plonegooglesitemaps/trunk/quintagroup/plonegooglesitemaps/filters.txt @ 3163

Last change on this file since 3163 was 3163, checked in by zidane, 13 years ago

fixes pyflakes and pylint

Blackout filtering
==================

Introduction
============

The Sitemap portal type has an option that filters out objects that
should be excluded from a sitemap. This option is accessible
on the sitemap edit form and is labeled "Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could
filter objects only by their ids, and it looked like:

<pre>
  index.html
  index_html
</pre>

As a result, all objects with "index.html" or "index_html" ids
were excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was refactored
into a pluggable architecture. Filters are now named
multi-adapters. There are only two default filters: "id" and "path".

Because different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow the specification:

  [<filter name>:]<filter arguments>

* If no <filter name> is specified, the "id" filter is used.
* If <filter name> is specified, the system looks for a
  <filter name>-named multi-adapter to the IBlackoutFilter interface.
  If no such multi-adapter is found, the filter is ignored without
  raising any errors.

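The record syntax described above can be sketched in plain Python. The
helper name parse_blackout_entry is hypothetical and is not taken from the
package; it only illustrates the rule that a record without a prefix is
handled by the "id" filter.

```python
def parse_blackout_entry(entry, default_filter="id"):
    """Split one "Blackout entries" record into (filter name, arguments).

    Hypothetical helper for illustration only; the package's own
    parsing code may differ in detail.
    """
    entry = entry.strip()
    # partition() splits on the first colon only, so the arguments
    # themselves may contain further colons.
    name, sep, args = entry.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so fall back to the default filter.
        return default_filter, entry
    return name, args
```

Under these assumptions, "doc1" is read as an "id" record with the argument
"doc1", and "path:/front-page" as a "path" record with the argument
"/front-page".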
The following sections demonstrate how to work with filtering.
Aspects of the default filters ("id" and "path") are also
considered.

Demonstration environment setup
===============================

First, we have to do some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following setting is useful when writing and debugging testbrowser
tests. It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for demonstration purposes.

In the root of the portal

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

We have now landed on the newly created sitemap edit form.
We are interested in the "Blackout entries" field on the edit
form, which should be empty by default.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet after the "Content Sitemap"
was added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the newly created sitemap.xml edit form.
Let's prepare a view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


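The slice smedit_url[:-5] above simply drops the five characters of the
trailing "/edit". A slightly more defensive version of the same idea is
sketched below; the function name view_url_from_edit_url is hypothetical
and is not part of the package.

```python
def view_url_from_edit_url(edit_url, suffix="/edit"):
    """Return the sitemap view URL for a given edit-form URL.

    Hypothetical helper; it refuses URLs that do not end in the suffix
    instead of silently cutting off unrelated characters.
    """
    if not edit_url.endswith(suffix):
        raise ValueError("not an edit URL: %r" % (edit_url,))
    # Same effect as edit_url[:-5] when the suffix is "/edit".
    return edit_url[:-len(suffix)]
```

For "http://nohost/plone/sitemap.xml/edit" this gives
"http://nohost/plone/sitemap.xml".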
No filters
==========

The created sitemap has no filters applied, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the result page really is a sitemap.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that will help us test which URLs pass the filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^<]*)</loc>" % self.portal.absolute_url(), re.S)

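To see what this regular expression captures, here is a small standalone
sketch. The XML fragment and the portal URL are made up for illustration,
and re.escape is an addition that the test above does not use; it only
guards against special characters in the URL.

```python
import re

portal_url = "http://nohost/plone"  # assumed portal URL, for illustration

# Hand-written sample, not real output of the package.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset>\n'
    '  <url><loc>http://nohost/plone/doc1</loc></url>\n'
    '  <url><loc>http://nohost/plone/Members/test_user_1_/doc2</loc></url>\n'
    '</urlset>\n'
)

# The group captures everything between the portal URL and </loc>,
# i.e. the path of each entry relative to the portal root.
reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)
paths = sorted(reloc.findall(sample))
```

Here paths holds the portal-relative paths "/Members/test_user_1_/doc2"
and "/doc1", in that order.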
Test if all 4 documents and default front-page are in the sitemap without filters.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page


Check "id" filter
=================

Go to the sitemap edit form and add "doc1" and "front-page" lines with "id:"
prefix to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

"doc1" and "front-page" documents should now be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


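The effect of the "id" filter can be pictured with plain Python objects.
This is only a sketch: Item is a stand-in for the objects the real filter
receives, and filter_out_by_id is a hypothetical name, not the package's
implementation.

```python
from collections import namedtuple

# Stand-in for a sitemap candidate: an id plus a portal-relative path.
Item = namedtuple("Item", ["getId", "path"])

ITEMS = [
    Item("doc1", "/Members/test_user_1_/doc1"),
    Item("doc2", "/Members/test_user_1_/doc2"),
    Item("doc1", "/doc1"),
    Item("doc2", "/doc2"),
    Item("front-page", "/front-page"),
]


def filter_out_by_id(items, blocked_id):
    """Drop every item whose id equals blocked_id, wherever it lives."""
    return [item for item in items if item.getId != blocked_id]


remaining = ITEMS
for blocked in ("doc1", "front-page"):
    remaining = filter_out_by_id(remaining, blocked)
remaining_paths = sorted(item.path for item in remaining)
```

Because the comparison looks only at the id, both "doc1" documents
disappear regardless of their folder, which matches the result above.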
Check "path" filter
===================

Suppose we want to exclude "front-page" from the portal root and the "doc2"
document located in the test_user_1_ home folder, but leave "doc2"
in the portal root, and all other objects, untouched.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

"/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2


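In contrast to the "id" filter, the "path" filter targets one specific
location. The sketch below assumes an exact match on the portal-relative
path, which is all the test above exercises; whether the real filter also
affects contained objects is not shown here. The name filter_out_by_path
is hypothetical.

```python
def filter_out_by_path(paths, blocked_path):
    """Drop the portal-relative path that equals blocked_path.

    Exact matching is an assumption made for this illustration.
    """
    return [p for p in paths if p != blocked_path]


paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]
for blocked in ("/Members/test_user_1_/doc2", "/front-page"):
    paths = filter_out_by_path(paths, blocked)
```

Only the member's "doc2" is removed; the "doc2" in the portal root stays,
because its path differs.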
Check default filter
====================

A natural question is: which filter is used when no filter name
prefix is specified (e.g. for old-style entries)?

Go to the sitemap edit form and add "doc1" and "front-page"
lines without any filter name prefix to the "Blackout entries"
field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter. So, all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Create your own filters
=======================

Suppose we want to create our own blackout filter, which
behaves like the id filter, but with some differences. Our filter
has the following format:

  (+|-)<filtered id>

- if the first character is "+", only objects with <filtered id>
  should be left in the sitemap after filtering;
- if the first character is "-", all objects with <filtered id>
  should be excluded from the sitemap (like the default id filter).

You need to create a new IBlackoutFilter multi-adapter and register
it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId==fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId!=fid]
    ...         return fdata


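The filterOut logic of the SignedIdFilter class above can be tried out
without Zope by feeding it stand-in objects. In this sketch Item and
signed_id_filter are hypothetical names; Item only models the getId
attribute that the filter reads.

```python
from collections import namedtuple

# Stand-in for the objects passed to filterOut; only getId is modelled.
Item = namedtuple("Item", ["getId"])


def signed_id_filter(items, fargs):
    """Same decision logic as SignedIdFilter.filterOut, as a function."""
    # fargs[:1] (rather than fargs[0]) also tolerates an empty argument.
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        return [i for i in items if i.getId == fid]
    if sign == "-":
        return [i for i in items if i.getId != fid]
    # Any other first character leaves the data untouched.
    return list(items)


items = [Item("doc1"), Item("doc2"), Item("front-page")]
kept_plus = [i.getId for i in signed_id_filter(items, "+doc1")]
kept_minus = [i.getId for i in signed_id_filter(items, "-doc1")]
kept_other = [i.getId for i in signed_id_filter(items, "doc1")]
```

With "+doc1" only "doc1" survives, with "-doc1" everything except "doc1"
survives, and an argument without a sign changes nothing.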
Now register this new filter as a named multi-adapter.

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

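Lookup by name, including the rule that unknown filter names are skipped
silently, can be pictured with a plain dictionary standing in for the Zope
component registry. Everything in this sketch (FILTERS, apply_blackout and
the two filter functions) is hypothetical and simplified; the package
itself resolves filters as named multi-adapters.

```python
def id_filter(paths, fargs):
    """Drop paths whose last segment (the object id) equals fargs."""
    return [p for p in paths if p.rsplit("/", 1)[-1] != fargs]


def path_filter(paths, fargs):
    """Drop the path that equals fargs exactly (assumed semantics)."""
    return [p for p in paths if p != fargs]


# A dictionary plays the role of the named-adapter registry here.
FILTERS = {"id": id_filter, "path": path_filter}


def apply_blackout(paths, entries, filters=FILTERS):
    """Apply each blackout record in turn and return the surviving paths."""
    for entry in entries:
        entry = entry.strip()
        if not entry:
            continue
        name, sep, fargs = entry.partition(":")
        if not sep:
            # No prefix: the "id" filter is the default.
            name, fargs = "id", entry
        func = filters.get(name)
        if func is None:
            # Unknown filter name: ignore the record without an error.
            continue
        paths = func(paths, fargs)
    return paths


PATHS = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]
result = apply_blackout(PATHS, ["doc1", "path:/front-page", "nosuch:x"])
```

Registering a new filter then amounts to adding one more named entry, for
example FILTERS["signedid"], without touching apply_blackout.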
That is all that is needed to add a new filter. Now test the newly created
filter.

Check whether white filtering ("+" prefix) works correctly.
Go to the sitemap edit form and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether black filtering ("-" prefix) works correctly.
Go to the sitemap edit form and add "signedid:-doc1" to the "Blackout
entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects, except those having "doc1" id, must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page