Blackout filtering
==================

Introduction
============

The Sitemap portal type has an option for filtering out objects that
should be excluded from a sitemap. This option is accessible
on the sitemap edit form and is labeled "Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could
filter objects only by their ids, and it looked like this:

<pre>
  index.html
  index_html
</pre>

As a result, all objects with "index.html" or "index_html" ids
were excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was refactored
into a pluggable architecture. Filters are now named multi-adapters.
There are only two default filters: "id" and "path".

Because different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow this specification:

  [<filter name>:]<filter arguments>

* If no <filter name> is specified, the "id" filter is used.
* If <filter name> is specified, the system looks for a
  <filter name>-named multi-adapter to the IBlackoutFilter interface.
  If no such multi-adapter is found, the filter is ignored without
  raising any errors.
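
The record syntax above can be illustrated with a minimal sketch. It is not
the package's implementation: the names parse_blackout_entry, apply_entries
and registry are hypothetical, the plain dictionary stands in for the named
multi-adapter lookup, and splitting on the first colon is an assumption.

```python
def parse_blackout_entry(entry, default_filter="id"):
    """Split one "Blackout entries" line into (filter name, filter arguments)."""
    entry = entry.strip()
    name, sep, args = entry.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so fall back to the default filter.
        return default_filter, entry
    return name, args


def apply_entries(items, entries, registry):
    """Apply each entry in turn; entries naming an unknown filter are skipped."""
    for entry in entries:
        if not entry.strip():
            continue  # blank lines in the field carry no filter
        name, args = parse_blackout_entry(entry)
        filter_func = registry.get(name)
        if filter_func is None:
            continue  # unknown filter: ignored without raising an error
        items = filter_func(items, args)
    return items


# Hypothetical registry: a dict of callables instead of named multi-adapters.
registry = {
    "id": lambda items, fid: [i for i in items if i.rsplit("/", 1)[-1] != fid],
}
paths = ["/doc1", "/doc2", "/Members/test_user_1_/doc1"]
result = apply_entries(paths, ["doc1", "nosuchfilter:whatever"], registry)
```

Here the unprefixed "doc1" entry is handled by the "id" filter, while the
entry naming an unregistered filter leaves the list unchanged.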

The following sections demonstrate how to work with filtering.
Aspects of the default filters ("id" and "path") are also
covered.

Demonstration environment setup
===============================

First, we have to do some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The next setting is useful when writing and debugging testbrowser tests:
it lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for demonstration purposes.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

We have now landed on the newly created sitemap edit form.
We are interested in the "Blackout entries" field on the edit
form; it should be empty by default.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


The "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet after the "Content Sitemap"
was added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the newly created sitemap.xml edit form.
Let's prepare the view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The created sitemap has no filters applied, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page is really a sitemap.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that will help us test which URLs pass the filters.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^<]*)</loc>" % self.portal.absolute_url(), re.S)

Test that all 4 documents and the default front page are in the sitemap when no filters are applied.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page

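
The extraction step can also be tried outside the test environment. The
sketch below applies the same kind of regular expression to a small,
made-up sitemap; the XML sample and the portal URL are assumptions chosen
for illustration, and re.escape is added here only as a precaution for
URLs that contain regex metacharacters.

```python
import re

# Assumed portal URL and a hand-written sitemap fragment (not real output).
portal_url = "http://nohost/plone"
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/doc2</loc></url>
  <url><loc>http://nohost/plone/doc1</loc></url>
</urlset>"""

# Capture everything between the portal URL and the closing </loc> tag.
reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)
found = sorted(reloc.findall(sitemap_xml))
```

Each captured group is the portal-relative path of one sitemap entry, which
is what the sorted listings in this document compare against.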

Check "id" filter
=================

Go to the sitemap edit form and add "doc1" and "front-page" lines with the "id:"
prefix to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

The "doc1" and "front-page" documents should now be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Check "path" filter
===================

Suppose we want to exclude "front-page" from the portal root and the "doc2"
document located in the test_user_1_ home folder, but leave "doc2"
in the portal root and all other objects untouched.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

The "/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2

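
The difference between the two default filters can be sketched with plain
Python objects. This is an illustration only: Brain is a hypothetical
stand-in for a catalog brain, its "path" field is an assumed
portal-relative path, and exact path matching is an assumption rather than
a statement about how the package's "path" filter is implemented.

```python
from collections import namedtuple

# Hypothetical stand-in for a catalog brain (id plus portal-relative path).
Brain = namedtuple("Brain", ["getId", "path"])

BRAINS = [
    Brain("doc1", "/doc1"),
    Brain("doc2", "/doc2"),
    Brain("front-page", "/front-page"),
    Brain("doc1", "/Members/test_user_1_/doc1"),
    Brain("doc2", "/Members/test_user_1_/doc2"),
]


def filter_by_id(brains, fid):
    """Drop every object whose id equals fid, wherever it lives."""
    return [b for b in brains if b.getId != fid]


def filter_by_path(brains, fpath):
    """Drop only the object located at exactly fpath (assumed exact match)."""
    return [b for b in brains if b.path != fpath]


by_id = sorted(b.path for b in filter_by_id(BRAINS, "doc2"))
by_path = sorted(
    b.path for b in filter_by_path(BRAINS, "/Members/test_user_1_/doc2")
)
```

Filtering by id removes both "doc2" documents, while filtering by path
removes only the one in the member's folder.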

Check default filter
====================

Which filter is used when no filter name prefix is specified
(e.g. for old-style entries)?

Go to the sitemap edit form and add "doc1" and "front-page"
lines without any filter name prefix to the "Blackout entries"
field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Create your own filters
=======================

Suppose we want to create our own blackout filter that
behaves like the id filter, with some differences. Our filter
has the following format:

  (+|-)<filtered id>

- if the first character is "+", only objects with <filtered id>
  should be left in the sitemap after filtering;
- if the first character is "-", all objects with <filtered id>
  should be excluded from the sitemap (like the default id filter).

You need to create a new IBlackoutFilter multi-adapter and register
it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId==fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId!=fid]
    ...         return fdata

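
The selection rule of this filter can be tried without Zope or Plone. The
sketch below repeats the "+"/"-" logic on plain Python objects; Brain and
signed_id_filter are hypothetical names introduced for illustration, and
the sample data simply mirrors the documents created earlier.

```python
from collections import namedtuple

# Hypothetical stand-in for a catalog brain.
Brain = namedtuple("Brain", ["getId", "path"])


def signed_id_filter(brains, fargs):
    """Apply the (+|-)<filtered id> rule to a list of brain-like objects."""
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        # White filtering: keep only objects with the given id.
        return [b for b in brains if b.getId == fid]
    if sign == "-":
        # Black filtering: drop objects with the given id.
        return [b for b in brains if b.getId != fid]
    return list(brains)  # unknown sign: leave the data untouched


brains = [
    Brain("doc1", "/doc1"),
    Brain("doc2", "/doc2"),
    Brain("front-page", "/front-page"),
    Brain("doc1", "/Members/test_user_1_/doc1"),
    Brain("doc2", "/Members/test_user_1_/doc2"),
]
kept = sorted(b.path for b in signed_id_filter(brains, "+doc1"))
dropped = sorted(b.path for b in signed_id_filter(brains, "-doc1"))
```

Slicing with fargs[:1] instead of fargs[0] is a small design choice in this
sketch: it avoids an IndexError when the argument string is empty.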

Now register this new filter as a named multi-adapter.

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter. Now test the newly created
filter.

Check whether white filtering ("+" prefix) works correctly.
Go to the sitemap edit form and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with the "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether black filtering ("-" prefix) works correctly.
Go to the sitemap edit form and add "signedid:-doc1" to the "Blackout
entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects except those with the "doc1" id must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page