source: products/quintagroup.plonegooglesitemaps/trunk/quintagroup/plonegooglesitemaps/filters.txt @ 3163

Last change on this file since 3163 was 3163, checked in by zidane, 13 years ago

fixes pyflakes and pylint

Blackout filtering
==================

Introduction
============

The Sitemap portal type has an option that filters out objects that
should be excluded from a sitemap. This option is accessible
on the sitemap edit form and is labeled "Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could
filter objects only by their ids, and it looked like:

  index.html
  index_html

As a result, all objects with "index.html" or "index_html" ids
were excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was refactored
into a pluggable architecture. Filters are now named multi
adapters. There are only two default filters: "id" and "path".

Since different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow the specification:

  [<filter name>:]<filter arguments>

* If no <filter name> is specified, the "id" filter is used.
* If <filter name> is specified, the system looks for a
  multiadapter named <filter name> that provides the IBlackoutFilter
  interface. If no such multiadapter is found, the filter is ignored
  without raising any errors.

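The record syntax described above can be sketched in plain Python. This is
only an illustration, not the package's actual code: parse_blackout_entry,
apply_blackout, the FILTERS dictionary and the two simplified filter
functions are hypothetical names, and a dictionary lookup stands in for the
named multiadapter lookup. The sketch assumes that the text before the first
":" is the filter name.

```python
def parse_blackout_entry(line, default_filter="id"):
    """Split one "Blackout entries" record into (filter name, arguments)."""
    record = line.strip()
    name, sep, args = record.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so fall back to the default filter.
        return default_filter, record
    return name, args


def apply_blackout(entries, items, filters):
    """Apply each record's filter in turn; unknown filter names are skipped."""
    for line in entries:
        if not line.strip():
            continue  # ignore empty lines in the field
        name, args = parse_blackout_entry(line)
        filter_func = filters.get(name)
        if filter_func is None:
            continue  # nothing registered under this name: ignore silently
        items = filter_func(items, args)
    return items


# Simplified stand-ins (assumptions) for the default "id" and "path" filters,
# working on plain path strings instead of real sitemap objects.
def id_filter(paths, fid):
    return [p for p in paths if p.rsplit("/", 1)[-1] != fid]


def path_filter(paths, path):
    return [p for p in paths if p != path]


FILTERS = {"id": id_filter, "path": path_filter}

paths = ["/doc1", "/doc2", "/Members/test_user_1_/doc1"]
print(apply_blackout(["doc1", "unknown:x"], paths, FILTERS))
```

With these stand-ins, the unprefixed record "doc1" goes to the "id" filter,
while "unknown:x" is dropped because no filter of that name exists. Note that
an id containing ":" would be misread by this simple split.
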
The following sections demonstrate how to work with filtering.
Aspects of the default filters ("id" and "path") are also
covered.

Demonstration environment setup
===============================

First, we have to do some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following setting is useful when writing and debugging testbrowser
tests. It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for demonstration purposes.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

We have now landed on the newly created sitemap edit form.
What we are interested in is the "Blackout entries" field on the edit
form; it should be empty by default.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


The "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet after the "Content Sitemap"
has been added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the newly created sitemap.xml edit form.
Let's derive the view URL from it to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]  # strip the trailing "/edit"


No filters
==========

The created sitemap has no filters applied, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page really is a sitemap...

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that will help us test which URLs pass the filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(self.portal.absolute_url()), re.S)

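As a self-contained illustration of what this pattern extracts, the sketch
below runs an equivalent expression over a small hand-written sitemap. The
XML sample and the portal URL http://nohost/plone are assumptions made for
the example; in the doctest the input comes from browser.contents.

```python
import re

portal_url = "http://nohost/plone"  # assumed test portal URL

# A tiny hand-written sitemap standing in for browser.contents.
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/front-page</loc></url>
  <url><loc>http://nohost/plone/doc1</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc1</loc></url>
</urlset>
"""

# Capture whatever sits between the portal URL and the closing </loc> tag.
# re.escape keeps characters such as "." in the URL from acting as wildcards.
reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)

# Sorting makes the result independent of the order of <url> elements.
paths = sorted(reloc.findall(sitemap_xml))
print("\n".join(paths))
```
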
Test that all 4 documents and the default front page are in the sitemap when no filters are applied.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page


Check "id" filter
=================

Go to the sitemap edit form and add "doc1" and "front-page" lines with "id:"
prefix to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

"doc1" and "front-page" documents should now be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Check "path" filter
===================

Suppose we want to exclude "front-page" from the portal root and the "doc2"
document located in the test_user_1_ home folder, but leave "doc2"
in the portal root untouched, along with all other objects.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

"/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2


Check default filter
====================

Which filter is used when no filter name prefix is specified
(as in old-style entries)?

Go to the sitemap edit form and add "doc1" and "front-page"
lines without any filter name prefix to the "Blackout entries"
field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Create your own filters
=======================

Suppose we want to create our own blackout filter, which
behaves like the id filter, but with some differences. Our filter
has the following format:

  (+|-)<filtered id>

- if the first character is "+", then only objects with <filtered id>
  should be left in the sitemap after filtering;
- if the first character is "-", then all objects with <filtered id>
  should be excluded from the sitemap (like the default id filter).

You need to create a new IBlackoutFilter multi-adapter and register
it with a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         # fargs looks like "+doc1" or "-doc1"
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId == fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId != fid]
    ...         return fdata


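The filtering rule itself can be tried outside Zope. The sketch below
repeats the "+"/"-" logic as a plain function. Item is a hypothetical
stand-in whose getId is a plain attribute, mirroring how the adapter above
compares b.getId without calling it, and signed_id_filter is a name made up
for this example. No adapter registration is involved.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects passed to filterOut.
Item = namedtuple("Item", ["getId", "path"])


def signed_id_filter(fdata, fargs):
    """Keep ("+") or drop ("-") items whose id equals the rest of fargs."""
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        return [b for b in fdata if b.getId == fid]
    if sign == "-":
        return [b for b in fdata if b.getId != fid]
    return fdata  # unknown sign: leave the data untouched


items = [
    Item("doc1", "/doc1"),
    Item("doc2", "/doc2"),
    Item("doc1", "/Members/test_user_1_/doc1"),
    Item("front-page", "/front-page"),
]

print([i.path for i in signed_id_filter(items, "+doc1")])
print([i.path for i in signed_id_filter(items, "-doc1")])
```
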
Now register this new filter as a named multiadapter ...

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter. Now test the newly created
filter.

Check whether white filtering ("+" prefix) works correctly.
Go to the sitemap edit form and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether black filtering ("-" prefix) works correctly.
Go to the sitemap edit form and add "signedid:-doc1" to the "Blackout
entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects, except those having "doc1" id, must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page