source: products/quintagroup.plonegooglesitemaps/branches/sitemap_date/quintagroup/plonegooglesitemaps/filters.txt @ 3565

Last change on this file since 3565 was 3499, checked in by potar, 12 years ago

renamed variables

Blackout filtering
==================

Introduction
============

The Sitemap portal type has an option that filters out objects that
should be excluded from a sitemap. This option is accessible
on the sitemap edit form and is labeled "Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could filter
objects only by their ids, and it looked like:

<pre>
  index.html
  index_html
</pre>

As a result, all objects with "index.html" or "index_html" ids
were excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was refactored
into a pluggable architecture. Filters are now named multi
adapters. There are only two default filters: "id" and "path".

Since different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow the specification:

  [<filter name>:]<filter arguments>

* If no <filter name> is specified, the "id" filter is used.
* If <filter name> is specified, the system looks for a
  multiadapter named <filter name> that provides the IBlackoutFilter
  interface. If no such multiadapter is found, the filter is ignored
  without raising any errors.

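The record syntax and the two lookup rules above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names and the dict-based registry are hypothetical stand-ins for the named multiadapter lookup, and splitting on the first ":" is an assumption about how records are parsed, not the package's confirmed code.

```python
def parse_blackout_entry(entry, default_filter="id"):
    """Split one "Blackout entries" record into (filter name, arguments)."""
    entry = entry.strip()
    if ":" in entry:
        # Assumption: everything before the first ":" is the filter name.
        name, args = entry.split(":", 1)
        return name.strip(), args.strip()
    # No prefix given: fall back to the "id" filter.
    return default_filter, entry


def apply_blackout(entries, data, registry):
    """Apply each record in turn; `registry` is a hypothetical dict that
    stands in for looking up a named IBlackoutFilter multiadapter."""
    for entry in entries:
        if not entry.strip():
            continue  # blank lines in the field carry no filter
        name, args = parse_blackout_entry(entry)
        filter_func = registry.get(name)
        if filter_func is None:
            continue  # unknown filter: ignored without raising any errors
        data = filter_func(data, args)
    return data
```

With a registry containing only an "id" entry, a record such as "nosuch:x" is skipped silently, while "doc1" and "id:doc1" are treated the same way.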
The following parts demonstrate how to work with filtering.
Aspects of the default filters ("id" and "path") are also
covered.

Demonstration environment setup
===============================

First, we have to do some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The next setting is useful when writing and debugging testbrowser tests.
It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We log in through the login form.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for demonstration purposes.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

We have now landed on the newly created sitemap edit form.
What we are interested in is the "Blackout entries" field on the edit
form; it should be empty by default.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet after the "Content Sitemap"
has been added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the newly created sitemap.xml edit form.
Let's prepare a view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The created sitemap has no filters applied, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page really is a sitemap...

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that will help us test which URLs pass the filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^\<]*)</loc>" % self.portal.absolute_url(), re.S)

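To see what this regular expression captures, here is a small standalone sketch that runs the same pattern against a hand-written sitemap fragment. The sample XML is hypothetical and much shorter than what the sitemap view returns; the base URL is the "http://nohost/plone" used in the test setup, and re.escape is added here only as a precaution for URLs containing regex metacharacters.

```python
import re

portal_url = "http://nohost/plone"  # base URL as used in the test setup

# Hypothetical, trimmed-down sitemap body used only for this illustration.
sample_sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset>
  <url><loc>http://nohost/plone/doc2</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc1</loc></url>
</urlset>"""

# Capture whatever follows the portal URL inside each <loc> element.
reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)

# Sorted list of portal-relative paths found in the sample.
paths = sorted(reloc.findall(sample_sitemap))
```

Each captured group is a path relative to the portal root, which is why the expected outputs in the following sections start with "/".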
Test that all 4 documents are in the sitemap when no filters are set.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2


Check "id" filter
=================

Go to the sitemap edit form and add a "doc1" line with the "id:"
prefix to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

Every document with the "doc1" id should now be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Check "path" filter
===================

Suppose we want to exclude the "doc2" document
located in the test_user_1_ home folder, but leave the "doc2"
in the portal root untouched, along with all other objects.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

Only the "/Members/test_user_1_/doc2" object should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2

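The difference between the two default filters can be illustrated on a plain list of portal-relative paths. This is a rough sketch, not the package's implementation: both function names are hypothetical, and treating the path filter as an exact match on the full path is an assumption based only on the results shown above.

```python
def filter_by_id(paths, fid):
    """Drop every path whose last segment (the object id) equals fid."""
    return [p for p in paths if p.rsplit("/", 1)[-1] != fid]


def filter_by_path(paths, fpath):
    """Drop only the entry whose full path equals fpath (assumed exact match)."""
    return [p for p in paths if p != fpath]


# The four documents created for this demonstration.
paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
]

without_doc1 = filter_by_id(paths, "doc1")
without_member_doc2 = filter_by_path(paths, "/Members/test_user_1_/doc2")
```

The id-based variant removes both "doc1" objects regardless of location, while the path-based variant removes a single object and leaves the "doc2" in the portal root in place.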

Check default filter
====================

This raises a question: which filter is used when no
filter name prefix is specified (e.g. old-style entries)?

Go to the sitemap edit form and add a "doc1" line
without any filter name prefix to the "Blackout entries"
field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so objects
with the "doc1" id should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Create your own filters
=======================

Suppose we want to create our own blackout filter, which
behaves like the id filter, but with some differences. Our filter
has the following format:

  (+|-)<filtered id>

- if the first character is "+", only objects with <filtered id>
  should be left in the sitemap after filtering;
- if the first character is "-", all objects with <filtered id>
  should be excluded from the sitemap (like the default id filter).

You need to create a new IBlackoutFilter multi-adapter and register
it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         # fargs is the text after "signedid:", e.g. "+doc1"
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId==fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId!=fid]
    ...         # any other leading character: leave the data untouched
    ...         return fdata

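The filtering logic of the class above can also be tried outside a Plone site. The sketch below is a simplified, standalone version: FakeBrain is a hypothetical stand-in that models only the getId attribute the filter reads, and signed_id_filter is an illustrative function, not part of the package. Unlike the class above, it slices with fargs[:1] so that an empty argument string does not raise an IndexError.

```python
class FakeBrain(object):
    """Hypothetical stand-in for an item in fdata; only getId is modelled."""

    def __init__(self, id_):
        # Plain attribute, matching how the filter compares b.getId.
        self.getId = id_


def signed_id_filter(fdata, fargs):
    """Keep ("+") or drop ("-") items whose id equals the rest of fargs."""
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        return [b for b in fdata if b.getId == fid]
    elif sign == "-":
        return [b for b in fdata if b.getId != fid]
    # Any other leading character (or empty input): return data unchanged.
    return fdata


brains = [FakeBrain("doc1"), FakeBrain("doc2"), FakeBrain("doc1")]
kept_ids = [b.getId for b in signed_id_filter(brains, "+doc1")]
dropped_ids = [b.getId for b in signed_id_filter(brains, "-doc1")]
```

Here "+doc1" keeps only the two "doc1" items and "-doc1" leaves only "doc2", which mirrors the sitemap results checked in the rest of this section.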

Now register this new filter as a named multiadapter...

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter. Now test the newly
created filter.

Check whether whitelist-style filtering ("+" prefix) works correctly.
Go to the sitemap edit form and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with the "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether blacklist-style filtering ("-" prefix) works correctly.
Go to the sitemap edit form and add "signedid:-doc1" to the "Blackout
entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects, except those with the "doc1" id, must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2