source: products/quintagroup.plonegooglesitemaps/trunk/quintagroup/plonegooglesitemaps/filters.txt @ 3002

Last change on this file since 3002 was 3002, checked in by mylan, 11 years ago

Merged revisions 3948,3950-3951,3954-3964,3978-3981,3984-3992,4016-4018,4028-4037,4039 via svnmerge from
http://svn.quintagroup.com/products/quintagroup.plonegooglesitemaps/branches/blacklist

........

r3948 | mylan | 2010-10-21 16:34:22 +0300 (Thu, 21 Oct 2010) | 1 line


#228: Added IBlackoutFilterUtility interface with skeleton of id and path utilities with appropriate tests.

........

r3950 | mylan | 2010-10-22 11:58:53 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added tests for default id and path filters

........

r3951 | mylan | 2010-10-22 11:59:25 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added default id and path blackout filters

........

r3954 | mylan | 2010-10-22 14:38:20 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added TestFormDataProcessing tests

........

r3955 | mylan | 2010-10-22 14:38:57 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added getBOFiltered method to sitemap common view

........

r3956 | mylan | 2010-10-22 15:17:40 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Simplify blacklists filter utility tests

........

r3957 | mylan | 2010-10-22 15:23:24 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Minor name updates in blackout list tests

........

r3958 | mylan | 2010-10-22 15:54:10 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added tests for relative path in path-filter

........

r3959 | mylan | 2010-10-22 15:54:50 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added relative path processing in default path-filter

........

r3960 | mylan | 2010-10-22 15:57:18 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Fixed name of the relative path filtering test

........

r3961 | mylan | 2010-10-25 13:51:32 +0300 (Mon, 25 Oct 2010) | 1 line


#228: Fix relative path filter tests and utility

........

r3962 | mylan | 2010-10-25 17:38:55 +0300 (Mon, 25 Oct 2010) | 1 line


#228: Fixed default Path filter

........

r3963 | mylan | 2010-10-25 17:49:27 +0300 (Mon, 25 Oct 2010) | 1 line


#228: Added request to extended list of args for a filter utility.

........

r3964 | mylan | 2010-10-25 17:50:57 +0300 (Mon, 25 Oct 2010) | 1 line


#228: Fixed rewrited blackout filter utility and functionality.

........

r3978 | mylan | 2010-10-28 19:26:11 +0300 (Thu, 28 Oct 2010) | 1 line


#228: Remake filter utility to multiadapter, fix tests

........

r3979 | mylan | 2010-10-28 19:57:24 +0300 (Thu, 28 Oct 2010) | 1 line


#228: Fix filter naming overhead, fix tests

........

r3980 | mylan | 2010-10-28 21:10:51 +0300 (Thu, 28 Oct 2010) | 1 line


#228: Fix overhead in parsing of filter arguments. Added test (breakage yet)

........

r3981 | mylan | 2010-10-29 12:57:34 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Add blackout_list value clean-up on editing sitemap

........

r3984 | mylan | 2010-10-29 13:53:54 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Update description of blackout list, remove preparations to filtering

........

r3985 | mylan | 2010-10-29 16:14:29 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Added doc tests bases

........

r3986 | mylan | 2010-10-29 16:53:41 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Added basic doctests of filtering

........

r3987 | mylan | 2010-10-29 16:57:40 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Minor fixes of filtering doctests.

........

r3988 | mylan | 2010-11-01 13:31:04 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Added doctests of id, path, default filters

........

r3989 | mylan | 2010-11-01 15:20:03 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Fixed default behavior filter doctests

........

r3990 | mylan | 2010-11-01 15:20:51 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Added example of new filter creation in doctests

........

r3991 | mylan | 2010-11-01 15:35:10 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Clean-up, simplify filters doctests

........

r3992 | mylan | 2010-11-01 15:39:55 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Force to show all doctests failures.

........

r4016 | mylan | 2010-11-05 14:19:46 +0200 (Fri, 05 Nov 2010) | 1 line


#228: Fix incorrect black_list field editing in tests

........

r4017 | mylan | 2010-11-05 14:21:35 +0200 (Fri, 05 Nov 2010) | 1 line


#228: minor docstring update for getBOFiltered method

........

r4018 | mylan | 2010-11-05 14:23:56 +0200 (Fri, 05 Nov 2010) | 1 line


#228: Remake filterOut method of into generator for default filters

........

r4028 | mylan | 2010-11-08 13:58:10 +0200 (Mon, 08 Nov 2010) | 1 line


#228: Added Plone v3.0 support

........

r4029 | mylan | 2010-11-08 16:35:22 +0200 (Mon, 08 Nov 2010) | 1 line


#228: Fixed tests issues

........

r4030 | mylan | 2010-11-08 16:49:28 +0200 (Mon, 08 Nov 2010) | 1 line


#228: Fixed ran of tearing down testing layers

........

r4031 | mylan | 2010-11-08 17:04:01 +0200 (Mon, 08 Nov 2010) | 1 line


#228: Fixed differencies in forms for plone<=3.1 and plone>3.1

........

r4032 | mylan | 2010-11-10 13:48:47 +0200 (Wed, 10 Nov 2010) | 1 line


#228: Switch to use collective.testcaselayer

........

r4033 | mylan | 2010-11-10 13:49:09 +0200 (Wed, 10 Nov 2010) | 1 line


#228: Fix doctest testcase

........

r4034 | mylan | 2010-11-10 14:34:17 +0200 (Wed, 10 Nov 2010) | 1 line


#228: Fixed list of required packages for testing

........

r4035 | mylan | 2010-11-11 12:29:06 +0200 (Thu, 11 Nov 2010) | 1 line


#228: remake filters to generators

........

r4036 | mylan | 2010-11-11 15:16:53 +0200 (Thu, 11 Nov 2010) | 1 line


#228: reviewed explanation, correct grammar for filters doctest

........

r4037 | mylan | 2010-11-11 15:19:24 +0200 (Thu, 11 Nov 2010) | 1 line


#228: updated histroy, bumped version to 1.6.0

........

r4039 | mylan | 2010-11-11 19:03:01 +0200 (Thu, 11 Nov 2010) | 1 line


#228:Fixed typo in doctests

........


Blackout filtering
==================

Filtering introduction
======================

The Sitemap portal type has an option designed to filter out
objects which should be excluded from a sitemap. This option is
accessible in the sitemap edit form and is labeled
"Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could filter
objects only by their ids, and looked like:

<pre>
  index.html
  index_html
</pre>

So, all objects with "index.html" or "index_html" ids were
excluded from the sitemap.

In the new versions of GoogleSitemaps, filtering was reworked
into a pluggable architecture. Filters are now named multi
adapters. There are only two default filters - "id" and
"path".

Since different filters can be used, a new syntax was applied
to the "Blackout entries" field. Every record in the field
should follow the specification:

  [<filter name>:]<filter arguments>

If no <filter name> is specified, the "id" filter will
be used. If <filter name> is specified, the system will look
for a <filter name>-named multiadapter to the IBlackoutFilter
interface. If no such multiadapter is found, the filter
will be ignored without raising any errors.
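
As a rough illustration of the record syntax above, the sketch below
splits a record into a filter name and its arguments and skips records
that name an unknown filter. It is plain Python, not the package's code:
parse_blackout_entry, apply_entries and the registry dictionary are
hypothetical names that stand in for the named multiadapter lookup.

```python
def parse_blackout_entry(entry, default_filter="id"):
    """Split one "Blackout entries" record into (filter name, arguments)."""
    entry = entry.strip()
    # Split on the first colon only, so the arguments may contain colons.
    name, sep, args = entry.partition(":")
    if not sep:
        # No "<filter name>:" prefix: fall back to the default filter.
        return default_filter, entry
    return name, args


def apply_entries(items, entries, registry):
    """Apply every record in turn; unknown filter names are skipped."""
    for entry in entries:
        if not entry.strip():
            continue  # ignore empty lines of the field
        name, args = parse_blackout_entry(entry)
        filter_func = registry.get(name)  # hypothetical stand-in for the lookup
        if filter_func is None:
            continue  # unknown filter: ignored without raising an error
        items = list(filter_func(items, args))
    return items
```

Splitting on the first colon only is an assumption of this sketch; an id
that itself contains a colon would need an explicit "id:" prefix here.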


The following sections demonstrate how the filtering works,
concentrating on the default filters ("id" and "path").

Setup demonstration environment
===============================

First, we must perform some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following is useful when writing and debugging testbrowser tests.
It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First create some content for demonstrations.

In the root of the portal

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

Now we are brought to the edit form of the newly created content sitemap.
We are interested in two things: the "Blackout entries" field must be
present in the form, and by default it should be empty.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the resulting sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet once the "Content Sitemap"
has been added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the edit form of the newly created sitemap.xml.
Let's prepare the view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The created sitemap has no filters, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page really is a sitemap...

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression which helps us test which URLs pass the filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^<]*)</loc>" % self.portal.absolute_url(), re.S)
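
To make the role of this expression concrete, here is a small standalone
sketch that applies the same pattern to a hand-written sitemap fragment.
The sample XML is made up for illustration (it is not output of the
package), and re.escape is an extra precaution not used in the test above.

```python
import re

portal_url = "http://nohost/plone"  # the host used by the test browser

# Hand-written sample, only shaped like a sitemap; not real package output.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/doc1</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc2</loc></url>
</urlset>"""

# Capture everything between the portal URL and the closing </loc> tag.
loc_pattern = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)
paths = sorted(loc_pattern.findall(sample))
print("\n".join(paths))
```

Each captured group is a path relative to the portal root, which is the
form in which the results are compared in the sections that follow.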

Test that all 4 documents and the default front-page are present in the
sitemap without filters.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page


Check "id" filter
=================

Go to the edit form of the sitemap and add "doc1"
and "front-page" lines with the "id:" prefix to the
"Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

"doc1" and "front-page" documents should be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2
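
The effect of the two "id" records can be imitated on plain path strings.
This is only a sketch under the assumption that an object's id is the last
segment of its path; filter_out_id is a hypothetical helper, not the
package's filter.

```python
def filter_out_id(paths, object_id):
    """Yield only the paths whose last segment differs from object_id."""
    for path in paths:
        if path.rsplit("/", 1)[-1] != object_id:
            yield path


remaining = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]
# Records are applied one after another, each narrowing the previous result.
for object_id in ("doc1", "front-page"):
    remaining = list(filter_out_id(remaining, object_id))
print("\n".join(remaining))
```

Because matching is done on the id alone, both "doc1" objects are dropped
regardless of the folder they live in.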


Check "path" filter
===================

Suppose we want to exclude the "front-page" from the portal root
and the "doc2" document located in the test_user_1_ home folder,
but leave "doc2" in the portal root and all other objects untouched.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

"/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2
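
For comparison with the id-based behaviour, a path-based exclusion can be
sketched on plain strings as well. The sketch assumes an exact match on
the path relative to the portal root and does not cover relative paths;
filter_out_path is a hypothetical helper, not the package's filter.

```python
def filter_out_path(paths, excluded_path):
    """Yield every path except the one equal to excluded_path."""
    return (path for path in paths if path != excluded_path)


remaining = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]
for excluded in ("/Members/test_user_1_/doc2", "/front-page"):
    remaining = list(filter_out_path(remaining, excluded))
print("\n".join(remaining))
```

Unlike an id match on "doc2", the path match leaves "/doc2" in the portal
root in place and removes only the member's copy.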


Check default filter
====================

Now the question is: "Which filter will be used when no
filter name prefix is specified (old-style filters, for
example)?"

Go to the edit form of the sitemap and add "doc1" and
"front-page" lines without any filter name prefix to the
"Blackout entries" field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Creating your own filters
=========================

Suppose we want to create our own blackout filter,
which behaves like the id filter but with some differences.
Our filter has the following format:

  (+|-)<filtered id>

  - when the first sign is "+", only objects with <filtered id>
    should be left in the sitemap after filtering;
  - if the first sign is "-", all objects with <filtered id>
    should be excluded from the sitemap (like the default id
    filter).

You need to create a new IBlackoutFilter multi-adapter
and register it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             # keep only objects with the given id
    ...             return [b for b in fdata if b.getId == fid]
    ...         elif sign == "-":
    ...             # drop objects with the given id
    ...             return [b for b in fdata if b.getId != fid]
    ...         return fdata
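
The same signed-id logic can also be written as a generator and tried
outside Zope. The sketch below is an illustration under stated
assumptions: Brain is a made-up stand-in that models only the getId
attribute, and signed_id_filter is a hypothetical free function rather
than the adapter method above.

```python
from collections import namedtuple

# Stand-in for the filtered objects; only the getId attribute is modelled.
Brain = namedtuple("Brain", "getId")


def signed_id_filter(fdata, fargs):
    """Yield items according to a "+<id>" or "-<id>" argument string."""
    sign, fid = fargs[:1], fargs[1:]
    for brain in fdata:
        if sign == "+" and brain.getId != fid:
            continue  # "+": keep only objects with the given id
        if sign == "-" and brain.getId == fid:
            continue  # "-": drop objects with the given id
        yield brain  # any other sign leaves the data unfiltered


brains = [Brain("doc1"), Brain("doc2"), Brain("front-page")]
kept_plus = [b.getId for b in signed_id_filter(brains, "+doc1")]
kept_minus = [b.getId for b in signed_id_filter(brains, "-doc1")]
print(kept_plus)
print(kept_minus)
```

A generator avoids building an intermediate list for every record, which
can matter when several filters are chained over a large result set.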


Now register this new filter as a named multiadapter ...

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter.
Now test the newly created filter.

Check whether white filtering ("+" prefix) works correctly.
Go to the edit form of the sitemap and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with the "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether black filtering ("-" prefix)
works correctly.
Go to the edit form of the sitemap and add "signedid:-doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects except those with the "doc1" id must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page