source: products/quintagroup.plonegooglesitemaps/trunk/quintagroup/plonegooglesitemaps/filters.txt @ 3247

Last change on this file since 3247 was 3247, checked in by zidane, 13 years ago

added plone4.1 compatibility

Blackout filtering
==================

Introduction
============

The Sitemap portal type has an option for filtering out objects that
should be excluded from a sitemap. This option is accessible
on the sitemap edit form and is labeled "Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could
filter objects only by their ids, and it looked like this:

<pre>
  index.html
  index_html
</pre>

As a result, all objects with "index.html" or "index_html" ids
were excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was refactored
into a pluggable architecture. Filters are now named
multi-adapters. There are only two default filters: "id" and "path".

Because different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow this specification:

  [<filter name>:]<filter arguments>

* If no <filter name> is specified, the "id" filter is used.
* If <filter name> is specified, the system looks for a
  multi-adapter named <filter name> providing the IBlackoutFilter
  interface. If no such multi-adapter is found, the filter is
  ignored without raising any errors.

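The record format above can be illustrated with a small standalone sketch.
It is not the package's own parser: parse_blackout_entry is a hypothetical
helper, and splitting at the first colon is an assumption based only on
the specification above.

```python
def parse_blackout_entry(entry, default_filter="id"):
    """Split one record into a (filter name, filter arguments) pair."""
    entry = entry.strip()
    # Assumption: the text before the first colon is the filter name.
    name, sep, args = entry.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so the default "id" filter applies.
        return default_filter, entry
    return name, args


records = ["index_html", "id:doc1", "path:/front-page", "signedid:+doc1"]
parsed = [parse_blackout_entry(record) for record in records]
```
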
The following sections demonstrate how to work with filtering.
Aspects of the default filters ("id" and "path") are also
covered.

Demonstration environment setup
===============================

First, we have to do some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following setting is useful when writing and debugging testbrowser
tests. It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We open the login form and log in through it.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for demonstration purposes.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

We have now landed on the edit form of the newly created sitemap.
What we are interested in is the "Blackout entries" field on the edit
form; it should be empty by default.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet after the "Content Sitemap"
has been added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the edit form of the newly created sitemap.xml.
Let's prepare a view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The created sitemap has no filters applied, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page really is a sitemap.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that will help us test which URLs pass the filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^\<]*)</loc>" % self.portal.absolute_url(), re.S)

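To see what this expression captures, here is a standalone sketch run
against a made-up sitemap fragment. The XML string is illustrative only,
and the use of re.escape on the portal URL is an addition that the
doctest itself does not make.

```python
import re

portal_url = "http://nohost/plone"
# Hypothetical sitemap fragment, reduced to the <loc> elements.
sample = (
    "<urlset>"
    "<url><loc>http://nohost/plone/doc1</loc></url>"
    "<url><loc>http://nohost/plone/Members/test_user_1_/doc2</loc></url>"
    "</urlset>"
)

# The group captures everything between the portal URL and "</loc>".
reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)
paths = sorted(reloc.findall(sample))
```
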
Test if all 4 documents and default front-page are in the sitemap without filters.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page


Check "id" filter
=================

Go to the sitemap edit form and add "doc1" and "front-page" lines with "id:"
prefix to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

"doc1" and "front-page" documents should now be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Check "path" filter
===================

Suppose we want to exclude "front-page" from the portal root and the "doc2"
document located in the test_user_1_ home folder, but leave "doc2"
in the portal root untouched along with all other objects.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

The "/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2


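The difference between the two default filters can be sketched on plain
path strings. This is only an illustration of the behaviour shown in the
examples above, not the package's implementation: id_filter and
path_filter are hypothetical functions, and exact matching of the
portal-relative path is an assumption.

```python
def id_filter(paths, fid):
    """Drop every path whose last segment (the object id) equals fid."""
    return [p for p in paths if p.rsplit("/", 1)[-1] != fid]


def path_filter(paths, fpath):
    """Drop only the object located exactly at fpath."""
    return [p for p in paths if p != fpath]


paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]
# The id filter removes "doc2" everywhere; the path filter removes one object.
by_id = id_filter(paths, "doc2")
by_path = path_filter(paths, "/Members/test_user_1_/doc2")
```
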
Check default filter
====================

Which filter is used when no filter name prefix is specified
(e.g. for old-style entries)?

Go to the sitemap edit form and add "doc1" and "front-page"
lines without any filter name prefix to the "Blackout entries"
field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Create your own filters
=======================

Suppose we want to create our own blackout filter that
behaves like the id filter, with some differences. Our filter
has the following format:

  (+|-)<filtered id>

- if the first character is "+", only objects with <filtered id>
  should be left in the sitemap after filtering;
- if the first character is "-", all objects with <filtered id>
  should be excluded from the sitemap (like the default id filter).

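The selection rule of this format can be sketched on plain id strings,
independently of any Zope machinery. signed_id_filter is a hypothetical
function used only for illustration, and slicing with fargs[:1] (so that
an empty argument does not raise) is a choice made for this sketch.

```python
def signed_id_filter(ids, fargs):
    """Apply a "(+|-)<filtered id>" rule to a list of object ids."""
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        # Whitelist: keep only objects with the given id.
        return [i for i in ids if i == fid]
    if sign == "-":
        # Blacklist: drop objects with the given id.
        return [i for i in ids if i != fid]
    # Unknown or missing sign: leave the data untouched.
    return list(ids)


ids = ["doc1", "doc2", "doc1", "doc2", "front-page"]
kept = signed_id_filter(ids, "+doc1")
dropped = signed_id_filter(ids, "-doc1")
```
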
You need to create a new IBlackoutFilter multi-adapter and register
it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         # fargs is the text after "signedid:", e.g. "+doc1"
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         # items in fdata are compared by their getId attribute
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId == fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId != fid]
    ...         return fdata


Now register this new filter as a named multi-adapter.

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter. Now test the newly created
filter.

Check whether white filtering ("+" prefix) works correctly.
Go to the sitemap edit form and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether black filtering ("-" prefix) works correctly.
Go to the sitemap edit form and add "signedid:-doc1" to the "Blackout
entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects except those with the "doc1" id must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page