source: products/quintagroup.plonegooglesitemaps/branches/blacklist/quintagroup/plonegooglesitemaps/filters.txt @ 2997

Last change on this file since 2997 was 2997, checked in by mylan, 13 years ago

#228: reviewed explanation, correct grammar for filters doctest

Blackout filtering
==================

Filtering introduction
======================

The Sitemap portal type has an option designed to filter out
objects that should be excluded from a sitemap. This option is
accessible in the sitemap edit form and is labeled
"Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could filter
objects only by their ids, and looked like:

<pre>
  index.html
  index_html
</pre>

So all objects with "index.html" or "index_html" ids were
excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was reworked
into a pluggable architecture. Filters are now named
multiadapters. There are only two default filters: "id" and
"path".

Since different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow the specification:

  [<filter name>:]<filter arguments>

If no <filter name> is specified, the "id" filter is
used. If <filter name> is specified, the system looks
for a multiadapter named <filter name> that provides the
IBlackoutFilter interface. If no such multiadapter is found, the
filter is ignored without raising any errors.

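The record syntax above can be illustrated with a small sketch. It is
not the package's own parsing code: parse_blackout_entry is a
hypothetical helper, and splitting on the first ":" is an assumption
made for the example.

```python
def parse_blackout_entry(entry, default_filter="id"):
    """Split one "Blackout entries" record into (filter name, arguments).

    Hypothetical helper for illustration; the package's real parsing
    may differ.
    """
    entry = entry.strip()
    # Assumption: the filter name is everything before the first colon.
    name, sep, args = entry.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so fall back to the "id" filter.
        return default_filter, entry
    return name, args


print(parse_blackout_entry("index.html"))        # ('id', 'index.html')
print(parse_blackout_entry("path:/front-page"))  # ('path', '/front-page')
```

Under this assumption, an old-style record such as "index.html" maps
to the "id" filter unchanged, which matches the backward-compatible
behaviour described above.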

The following sections demonstrate how filtering works.
They cover the default filters ("id" and "path")
as well as a custom filter.

Setup demonstration environment
===============================

First, we must perform some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following is useful when writing and debugging testbrowser tests.
It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for the demonstrations.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

Now we are brought to the edit form of the newly created content sitemap.
We are interested in two things: the "Blackout entries" field must be
present in the form, and by default it should be empty.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the resulting sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet once the "Content Sitemap"
has been added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the edit form of the newly created sitemap.xml.
Let's prepare the view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The created sitemap has no filters, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page really is a sitemap...

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that helps us test which URLs pass the filters.

    >>> reloc = re.compile("<loc>%s([^\<]*)</loc>" % self.portal.absolute_url(), re.S)

Test that all 4 documents and the default front-page are present in the
sitemap without filters.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page

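The way the regular expression above pulls portal-relative paths out
of the sitemap can be tried outside the test environment. The sketch
below uses a hypothetical portal URL and a hand-written sitemap
fragment; the re.escape call is an addition that is not in the
doctest, used here so that the dots in the URL are matched literally.

```python
import re

# Hypothetical portal URL and sitemap fragment, for illustration only.
portal_url = "http://nohost/plone"
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset>
  <url><loc>http://nohost/plone/doc1</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc1</loc></url>
  <url><loc>http://nohost/plone/front-page</loc></url>
</urlset>
"""

# Capture everything between the portal URL and the closing </loc> tag.
reloc = re.compile(r"<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)

paths = sorted(reloc.findall(sitemap_xml))
print("\n".join(paths))
# /Members/test_user_1_/doc1
# /doc1
# /front-page
```

Because the capture group starts right after the portal URL, every
result begins with "/" and can be compared directly with the paths
listed in the expected output.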

Check "id" filter
=================

Go to the edit form of the sitemap and add "doc1"
and "front-page" lines with the "id:" prefix to the
"Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

The "doc1" and "front-page" documents should be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2

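The effect of the "id" filter on the five test paths can be modelled
in plain Python. This is only a sketch of the observed behaviour, not
the package's implementation: filter_by_id is a hypothetical function,
and it assumes that an object's id is the last segment of its path.

```python
def filter_by_id(paths, excluded_ids):
    """Drop every path whose last segment (the object id) is excluded."""
    excluded = set(excluded_ids)
    return [p for p in paths if p.rsplit("/", 1)[-1] not in excluded]


all_paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]

remaining = filter_by_id(all_paths, ["doc1", "front-page"])
print("\n".join(remaining))
# /Members/test_user_1_/doc2
# /doc2
```

Note that the id match is independent of location, so both "doc1"
documents disappear, in the portal root and in the member's folder.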

Check "path" filter
===================

Suppose we want to exclude the "front-page" from the portal root
and the "doc2" document located in the test_user_1_ home folder,
but leave "doc2" in the portal root and all other objects untouched.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

The "/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2

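The same five paths show how the "path" filter differs from the "id"
filter. The sketch below assumes an exact match on the
portal-relative path, which is consistent with the result above but
is not confirmed by this document; filter_by_path is a hypothetical
function.

```python
def filter_by_path(paths, excluded_paths):
    """Drop only the paths that match an excluded path exactly."""
    # Assumption: exact match; the real filter may also handle subtrees.
    excluded = set(excluded_paths)
    return [p for p in paths if p not in excluded]


all_paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]

remaining = filter_by_path(
    all_paths, ["/Members/test_user_1_/doc2", "/front-page"])
print("\n".join(remaining))
# /Members/test_user_1_/doc1
# /doc1
# /doc2
```

Unlike an id match, the path match leaves "/doc2" in the portal root
in place even though the member's "doc2" is removed.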

Check default filter
====================

This raises a question: which filter is used when no
filter name prefix is specified (as in old-style entries, for
example)?

Go to the edit form of the sitemap and add "doc1" and
"front-page" lines without any filter name prefix to the
"Blackout entries" field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Creating custom filters
=======================

Suppose we want to create our own blackout filter,
which behaves like the id filter but with some differences.
Our filter has the following format:

  (+|-)<filtered id>

  - when the first character is "+", only objects with <filtered id>
    should be left in the sitemap after filtering;
  - when the first character is "-", all objects with <filtered id>
    should be excluded from the sitemap (like the default id
    filter).

You need to create a new IBlackoutFilter multiadapter
and register it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         # fargs is the text after "signedid:", e.g. "+doc1"
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             # keep only the objects whose id matches
    ...             return [b for b in fdata if b.getId == fid]
    ...         elif sign == "-":
    ...             # drop the objects whose id matches
    ...             return [b for b in fdata if b.getId != fid]
    ...         return fdata

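The branching in filterOut can be exercised without a Plone site.
The sketch below repeats the same logic as a plain function and runs
it against hypothetical stand-in objects: FakeBrain is not part of
any library, and like the class above it treats getId as a plain
attribute.

```python
class FakeBrain(object):
    """Hypothetical stand-in for a filtered object; only getId is modelled."""

    def __init__(self, path):
        self.path = path
        self.getId = path.rsplit("/", 1)[-1]


def signed_id_filter(fdata, fargs):
    """Same branching as SignedIdFilter.filterOut, without the adapter."""
    sign, fid = fargs[0], fargs[1:]
    if sign == "+":
        # Whitelist: keep only the objects with the given id.
        return [b for b in fdata if b.getId == fid]
    elif sign == "-":
        # Blacklist: drop the objects with the given id.
        return [b for b in fdata if b.getId != fid]
    # Any other first character leaves the data unfiltered.
    return fdata


brains = [FakeBrain(p) for p in ("/doc1", "/doc2", "/front-page")]
print([b.path for b in signed_id_filter(brains, "+doc1")])
# ['/doc1']
print([b.path for b in signed_id_filter(brains, "-doc1")])
# ['/doc2', '/front-page']
```

An empty argument string would raise IndexError at fargs[0]; neither
this sketch nor the class above guards against that case.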

Now register this new filter as a named multiadapter ...

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter.
Now test the newly created filter.

Check whether white filtering ("+" prefix) works correctly.
Go to the edit form of the sitemap and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with the "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether black filtering ("-" prefix)
works correctly.
Go to the edit form of the sitemap and add "signedid:-doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects except those with the "doc1" id must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page