source: products/quintagroup.plonegooglesitemaps/trunk/quintagroup/plonegooglesitemaps/filters.txt @ 3005

Last change on this file since 3005 was 3005, checked in by olha, 13 years ago

doc files updated

Blackout filtering
==================

Introduction
============

The Sitemap portal type has an option for filtering out objects that
should be excluded from a sitemap. This option is accessible
on the sitemap edit form and is labeled "Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could
filter objects only by their ids, and it looked like this:

<pre>
  index.html
  index_html
</pre>

As a result, all objects with "index.html" or "index_html" ids
were excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was refactored
into a pluggable architecture. Filters are now named
multi-adapters. There are only two default filters: "id" and "path".

Because different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow the specification:

  [<filter name>:]<filter arguments>

* If no <filter name> is specified, the "id" filter is used.
* If <filter name> is specified, the system looks for a multi-adapter
  named <filter name> that provides the IBlackoutFilter interface.
  If no such multi-adapter is found, the filter is ignored without
  raising any errors.

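The record syntax above can be illustrated with a small standalone sketch.
The helpers parse_blackout_entry and parse_blackout_field are hypothetical
names introduced here for illustration only; they mirror the
[<filter name>:]<filter arguments> specification and are not the package's
actual parsing code. Stripping whitespace and skipping blank lines is also
an assumption.

```python
def parse_blackout_entry(entry, default_filter="id"):
    """Split one record into (filter name, filter arguments).

    Hypothetical helper: follows the documented record syntax, not the
    package's real implementation.
    """
    entry = entry.strip()
    name, sep, args = entry.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so fall back to the "id" filter.
        return default_filter, entry
    return name, args


def parse_blackout_field(text):
    """Parse a whole field value, one record per non-blank line."""
    return [parse_blackout_entry(line)
            for line in text.splitlines() if line.strip()]


entries = parse_blackout_field("""
    index.html
    id:doc1
    path:/Members/test_user_1_/doc2
    signedid:+doc1
""")
for name, args in entries:
    print(name, args)
```

In this sketch an unprefixed record such as "index.html" is routed to the
"id" filter, which matches the behavior of the older package versions.
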
The following sections demonstrate how to work with filtering.
Aspects of the default filters ("id" and "path") are also
covered.

Demonstration environment setup
===============================

First, we have to do some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following setting is useful when writing and debugging testbrowser
tests. It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for demonstration purposes.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

Now we have landed on the newly created sitemap edit form.
What we are interested in is the "Blackout entries" field on the edit
form; it should be empty by default.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button takes us to the sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet after the "Content Sitemap"
has been added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the newly created sitemap.xml edit form.
Let's prepare a view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The created sitemap has no filters applied, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page is really a sitemap:

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that will help us test which URLs pass the filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^\<]*)</loc>" % self.portal.absolute_url(), re.S)

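To see what this expression captures, here is a minimal standalone sketch
run against a made-up sitemap body. The sample XML is an assumption written
for illustration (it is not output of the package), and re.escape is added
here so that dots in the portal URL are matched literally.

```python
import re

# Assumed portal URL and sitemap body, invented for illustration only.
portal_url = "http://nohost/plone"
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/doc1</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc1</loc></url>
  <url><loc>http://nohost/plone/front-page</loc></url>
</urlset>
"""

# Capture whatever follows the portal URL inside each <loc> element.
reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)

# Each match is a path relative to the portal root.
paths = sorted(reloc.findall(sitemap_xml))
print("\n".join(paths))
```

The captured group is the path relative to the portal root, which is why
the expected results in the following sections are lists of such paths.
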
Test that all 4 documents and the default front-page are in the sitemap without filters.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page


Check "id" filter
=================

Go to the sitemap edit form and add "doc1" and "front-page" lines with the "id:"
prefix to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

"doc1" and "front-page" documents should now be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Check "path" filter
===================

Suppose we want to exclude "front-page" from the portal root and the "doc2"
document located in the test_user_1_ home folder, but leave "doc2"
in the portal root untouched, along with all other objects.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

"/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2


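The difference between the "id" and "path" filters, as exercised in the
two sections above, can be sketched on plain path strings. The functions
id_filter and path_filter are hypothetical stand-ins, and the exact-match
rule used for paths is an assumption; the real filters work on catalog
results and may match differently.

```python
# The five objects present in the unfiltered demonstration sitemap.
paths = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]


def id_filter(paths, fid):
    """Drop every object whose id (last path segment) equals fid."""
    return [p for p in paths if p.rsplit("/", 1)[-1] != fid]


def path_filter(paths, fpath):
    """Drop only the object located exactly at fpath (assumed rule)."""
    return [p for p in paths if p != fpath]


# "id:doc1" plus "id:front-page" removes matching ids anywhere in the site.
by_id = id_filter(id_filter(paths, "doc1"), "front-page")

# The two "path:" records remove one specific object each.
by_path = path_filter(
    path_filter(paths, "/Members/test_user_1_/doc2"), "/front-page")

print(by_id)
print(by_path)
```

An id entry removes every object with that id regardless of location,
while a path entry targets a single location.
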
Check default filter
====================

This raises a question: which filter is used when no
filter name prefix is specified (e.g. old-style entries)?

Go to the sitemap edit form and add "doc1" and "front-page"
lines without any filter name prefix to the "Blackout entries"
field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Create your own filters
=======================

Suppose we want to create our own blackout filter, which
behaves like the id filter but with some differences. Our filter
has the following format:

  (+|-)<filtered id>

- if the first sign is "+", only objects with <filtered id>
  should be left in the sitemap after filtering;
- if the first sign is "-", all objects with <filtered id>
  should be excluded from the sitemap (like the default id filter).

You need to create a new IBlackoutFilter multi-adapter and register
it with a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId==fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId!=fid]
    ...         return fdata


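The filtering logic of filterOut can be tried outside Zope with a minimal
sketch. The Brain namedtuple is a hypothetical stand-in for a catalog
result and assumes, as the class above does, that getId is a plain
attribute rather than a method. Unlike the class above, the sketch takes
the sign with fargs[:1] so that an empty argument string does not raise
an IndexError.

```python
from collections import namedtuple

# Hypothetical stand-in for a catalog brain: getId is a plain attribute.
Brain = namedtuple("Brain", ["getId", "path"])


def signed_id_filter(fdata, fargs):
    """Keep ("+") or drop ("-") the items whose id follows the sign."""
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        return [b for b in fdata if b.getId == fid]
    if sign == "-":
        return [b for b in fdata if b.getId != fid]
    # Unknown or missing sign: leave the data untouched.
    return fdata


brains = [
    Brain("doc1", "/doc1"),
    Brain("doc2", "/doc2"),
    Brain("doc1", "/Members/test_user_1_/doc1"),
    Brain("front-page", "/front-page"),
]

kept = [b.path for b in signed_id_filter(brains, "+doc1")]
dropped = [b.path for b in signed_id_filter(brains, "-doc1")]
print(kept)
print(dropped)
```
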
Now register this new filter as a named multi-adapter:

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter. Now test the newly
created filter.

Check whether white filtering ("+" prefix) works correctly.
Go to the sitemap edit form and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with the "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether black filtering ("-" prefix) works correctly.
Go to the sitemap edit form and add "signedid:-doc1" to the "Blackout
entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects, except those with the "doc1" id, must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page