source: products/quintagroup.plonegooglesitemaps/trunk/quintagroup/plonegooglesitemaps/filters.txt @ 3247

Last change on this file since 3247 was 3247, checked in by zidane, 9 years ago

added plone4.1 compatibility

Blackout filtering
==================

Introduction
============

The Sitemap portal type has an option that filters out objects
that should be excluded from a sitemap. This option is accessible
on the sitemap edit form and is labeled "Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could filter
objects only by their ids, and it looked like:

<pre>
  index.html
  index_html
</pre>

As a result, all objects with "index.html" or "index_html" ids
were excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was refactored
into a pluggable architecture. Filters are now named multi-adapters.
There are only two default filters: "id" and "path".

Since different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow the specification:

  [<filter name>:]<filter arguments>

* If no <filter name> is specified, the "id" filter is used.
* If <filter name> is specified, the system looks for a
  multi-adapter named <filter name> that provides the IBlackoutFilter
  interface. If no such multi-adapter is found, the filter is
  ignored without raising any errors.
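
The record syntax above can be illustrated with a small standalone
sketch. It is not the package's actual code: the function names and the
plain dictionary standing in for the named multi-adapter lookup are
assumptions made for illustration only.

```python
def parse_blackout_record(record, default_filter="id"):
    """Split one "Blackout entries" line into (filter name, arguments)."""
    record = record.strip()
    if ":" in record:
        # Split on the first colon only, so the arguments may contain colons.
        name, args = record.split(":", 1)
        return name.strip(), args.strip()
    # No prefix given: fall back to the default ("id") filter.
    return default_filter, record


def resolve_filters(records, registry):
    """Return (filter, arguments) pairs, skipping unknown filter names.

    `registry` is a hypothetical dict that stands in for the lookup of
    named IBlackoutFilter multi-adapters.
    """
    resolved = []
    for record in records:
        if not record.strip():
            continue  # blank lines carry no filter
        name, args = parse_blackout_record(record)
        found = registry.get(name)
        if found is None:
            continue  # unknown filter name: ignored, no error raised
        resolved.append((found, args))
    return resolved


registry = {"id": "ID_FILTER", "path": "PATH_FILTER"}
print(resolve_filters(["index_html", "path:/front-page", "nosuch:x"], registry))
```

Splitting on the first colon only is a design choice of this sketch: it
keeps any further colons as part of the filter arguments.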

The following parts demonstrate how to work with filtering.
Aspects of the default filters ("id" and "path") are also
covered.

Demonstration environment setup
===============================

First, we have to do some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

This is useful when writing and debugging testbrowser tests. It lets
us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We log in through the login form.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for demonstration purposes.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

We have now landed on the newly created sitemap edit form.
We are interested in the "Blackout entries" field on the edit
form, which should be empty by default.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet after the "Content Sitemap"
was added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the newly created sitemap.xml edit form.
Let's prepare a view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The created sitemap has no filters applied, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page is really a sitemap:

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that extracts, relative to the portal URL,
the locations that pass the filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^\<]*)</loc>" % self.portal.absolute_url(), re.S)

Test that all 4 documents and the default front page are in the sitemap without filters.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page

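To see what the "reloc" expression captures, here is a standalone sketch
that runs the same pattern against a hand-written sitemap fragment. The
sample XML and the fixed portal URL are assumptions for illustration; a
real sitemap carries more elements per entry. The sketch also wraps the
URL in re.escape, which the test above does not do.

```python
import re

portal_url = "http://nohost/plone"  # assumed portal URL, as in the test setup

# Hand-written sample, not output of the package.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/doc2</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc1</loc></url>
</urlset>
"""

# re.escape guards against regex metacharacters (such as ".") in the URL.
reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)

# The single group captures only the part of each location after the portal URL.
paths = sorted(reloc.findall(sample))
print("\n".join(paths))
```

Because the pattern has exactly one group, findall returns a list of the
portal-relative paths rather than the full matches.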

Check "id" filter
=================

Go to the sitemap edit form and add "doc1" and "front-page" lines with "id:"
prefix to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

"doc1" and "front-page" documents should now be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Check "path" filter
===================

Suppose we want to exclude "front-page" from the portal root and the
"doc2" document located in the test_user_1_ home folder, but leave
"doc2" in the portal root untouched, along with all other objects.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

"/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2

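The difference between the two default filters can be sketched without
Plone. The "Entry" tuple and both functions below are hypothetical
stand-ins, and the exact-match rule for paths is an assumption that is
merely consistent with the results shown above; the package's own
filters may be implemented differently.

```python
from collections import namedtuple

# Stand-in for a catalog result; only the fields used here are modelled.
Entry = namedtuple("Entry", ["id", "path"])

entries = [
    Entry("doc1", "/doc1"),
    Entry("doc2", "/doc2"),
    Entry("front-page", "/front-page"),
    Entry("doc1", "/Members/test_user_1_/doc1"),
    Entry("doc2", "/Members/test_user_1_/doc2"),
]


def filter_by_id(items, arg):
    """Drop every entry whose id equals arg, wherever it lives."""
    return [e for e in items if e.id != arg]


def filter_by_path(items, arg):
    """Drop only the entry whose portal-relative path equals arg."""
    return [e for e in items if e.path != arg]


by_id = filter_by_id(entries, "doc2")
by_path = filter_by_path(entries, "/Members/test_user_1_/doc2")
print([e.path for e in by_id])
print([e.path for e in by_path])
```

Filtering by id removes both "doc2" documents, while filtering by path
removes only the one in the member folder.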

Check default filter
====================

This raises a question: which filter is used when no filter
name prefix is specified (as in old-style entries)?

Go to the sitemap edit form and add "doc1" and "front-page"
lines without any filter name prefix to the "Blackout entries"
field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Create your own filters
=======================

Suppose we want to create our own blackout filter that behaves
like the id filter, with some differences. Our filter
has the following format:

  (+|-)<filtered id>

- if the first character is "+", only objects with <filtered id>
  are left in the sitemap after filtering;
- if the first character is "-", all objects with <filtered id>
  are excluded from the sitemap (like the default id filter).

You need to create a new IBlackoutFilter multi-adapter and register
it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         # fargs is "+<id>" or "-<id>"; the first character selects the mode.
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId == fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId != fid]
    ...         return fdata

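The selection logic of filterOut can be tried outside Zope with stand-in
objects. In the sketch below, "Brain" is a hypothetical namedtuple that
assumes the filtered items expose "getId" as a plain attribute, the way
the class above reads it. The guard for empty arguments is an addition
of this sketch and is not part of the class.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects passed to filterOut.
Brain = namedtuple("Brain", ["getId", "path"])


def signed_id_filter(fdata, fargs):
    """Keep ("+") or drop ("-") the items whose id follows the sign."""
    if not fargs:
        return list(fdata)  # added guard: nothing to filter on
    sign, fid = fargs[0], fargs[1:]
    if sign == "+":
        return [b for b in fdata if b.getId == fid]
    if sign == "-":
        return [b for b in fdata if b.getId != fid]
    return list(fdata)  # unknown sign: leave the data untouched


brains = [
    Brain("doc1", "/doc1"),
    Brain("doc2", "/doc2"),
    Brain("doc1", "/Members/test_user_1_/doc1"),
]
kept = signed_id_filter(brains, "+doc1")
dropped = signed_id_filter(brains, "-doc1")
print([b.path for b in kept])
print([b.path for b in dropped])
```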

Now register this new filter as a named multi-adapter:

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter. Now test the newly
created filter.

Check whether whitelist filtering ("+" prefix) works correctly.
Go to the sitemap edit form and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with the "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether blacklist filtering ("-" prefix) works correctly.
Go to the sitemap edit form and add "signedid:-doc1" to the "Blackout
entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects, except those with the "doc1" id, must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page