Prevent of bolding entire content pasted from google docs #62

msamsel · 2019-07-16T11:42:35Z

Suggested merge commit message (convention)

Feature: Prevent of bolding entire content pasted from google docs. Closes ckeditor/ckeditor5#2491 .

Additional information

This PR refactors the automatic tests. After this change, the normalization tests mock the entire clipboard event instead of only firing the normalization function for MS Word.
There are also small changes in the descriptions of unit tests for better separation and readability. It is really hard to figure out which tests are run from which file, as there are way too many subgroups and levels of indentation, which start to get messy. Mocha does not combine subgroups with the same names from different files (see mochajs/mocha#1413, "Group up descriptions with same names").


msamsel · 2019-07-17T12:56:29Z

@Mgsy maybe you will have a moment to click through this PR and find some strangely defined Google Docs document which will still be pasted as bold.

Just please be aware that this PR fixes only the situation where the entire pasted text becomes bold. Support for the basic styles, lists and other features will be introduced later in follow-up PRs.

Mgsy · 2019-07-18T08:59:28Z

Unfortunately, this fix doesn't work on Windows.

msamsel · 2019-07-18T15:08:51Z

@Mgsy can you take a look one more time? It should work now.

jodator

I was reviewing this for some time and tried to remove the need for static methods - they are artificial IMO, for test purposes only, and I think that we can spend some more time to clean some things up here a bit. And probably make PFO more maintainable for the future (pasting from other office suites or editor types - i.e. Excel).

In order to start this, we need to know the API for that - namely, what must be passed to the normalize method? In the _inputTransformationListener() method there's either html & dataTransfer passed or data.content. This part needs to be a bit unified - it would be nice to have at most 2 params (where the first is the data which needs to be processed and the latter is additional data, like dataTransfer for Word).

I'm thinking about an entirely private API for now - just to refactor the switch and make PFO easier to test.

The basic idea is to create an interface - from what I see ATM something basic like:

interface Normalizer {
    isActive( html ); // bad name - anyway t
    normalize( html, dataTransfer )
}

This way we will be able to:

avoid exposing private methods for testing (i.e. by creating a stub with the same API as Normalizer that will check whether the proper normalizer is called and whether it is called only once)
test normalizers independently (or just as an integration test as now - doesn't really matter)
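
A minimal sketch of the idea above, with hypothetical names throughout: `WordNormalizer`, `GoogleDocsNormalizer`, both detection regexps and `runNormalizers()` are assumptions made for illustration, not the plugin's actual code. It shows how a listener could pick the first active normalizer, and how a stub with the same shape lets a test check that it is called exactly once.

```javascript
// Hypothetical normalizers implementing the two-method interface sketched above.
class WordNormalizer {
	isActive( html ) {
		// Assumed detection rule: the generator meta tag found in Word clipboard data.
		return /<meta\s+name="?generator"?\s+content="?microsoft\s+word/i.test( html );
	}

	normalize( html, dataTransfer ) {
		// Placeholder for the real Word filters; it only strips meta tags here.
		return html.replace( /<meta[^>]*>/gi, '' );
	}
}

class GoogleDocsNormalizer {
	isActive( html ) {
		// Assumed detection rule: an internal GUID id on the wrapping <b> element.
		return /id=("|')docs-internal-guid-/.test( html );
	}

	normalize( html ) {
		// Placeholder: unwrap the single outer <b ...> element.
		return html.replace( /^<b [^>]*>([\s\S]*)<\/b>$/, '$1' );
	}
}

// Runs the first normalizer that recognizes the content; returns the input unchanged otherwise.
function runNormalizers( normalizers, html, dataTransfer ) {
	const active = normalizers.find( normalizer => normalizer.isActive( html ) );

	return active ? active.normalize( html, dataTransfer ) : html;
}

// A stub with the same shape records its calls, so no private method has to be exposed.
const calls = [];
const stub = {
	isActive: () => true,
	normalize: html => {
		calls.push( html );

		return html;
	}
};

runNormalizers( [ stub, new WordNormalizer() ], '<p>Foo</p>' );
console.log( `stub called ${ calls.length } time(s)` );
```

Because the dispatcher only depends on the two methods, adding another source (for example a spreadsheet application) would mean adding one more object to the array rather than another `case` in a switch.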

jodator · 2019-07-18T14:07:17Z

src/pastefromoffice.js

+	 * Listener fired during {@link module:clipboard/clipboard~Clipboard#event:inputTransformation `inputTransformation` event}.
+	 * Detects if content comes from a recognized source and normalize it.
+	 *
+	 * **Note**: this function was exposed mainly for testing purposes and should not be called directly.


Suggested change

* **Note**: this function was exposed mainly for testing purposes and should not be called directly.

* **Note**: this function is exposed mainly for testing purposes and should not be called directly.

jodator · 2019-07-18T14:16:18Z

src/pastefromoffice.js

+					data.content = PasteFromOffice._normalizeWordInput( html, data.dataTransfer );
+					break;
+				case 'gdocs':
+					data.content = PasteFromOffice._normalizeGoogleDocsInput( data.content );


Why is data.content taken directly here and not html, as for the Word input? At least some explanation is needed.

src/filters/common.js

jodator · 2019-07-18T14:32:44Z

tests/_utils/utils.js

+					} )
+				};
+
+				PasteFromOffice._inputTransformationListener( null, data );


🔥 can be used as I checked that:

editor.plugins.get( 'Clipboard' ).fire( 'inputTransformation', data );

jodator · 2019-07-18T14:33:19Z

src/pastefromoffice.js

+	 * @param {module:utils/eventinfo~EventInfo} evt
+	 * @param {Object} data same structure like {@link module:clipboard/clipboard~Clipboard#event:inputTransformation input transformation}
+	 */
+	static _inputTransformationListener( evt, data ) {


Usually, we do not use such static methods. Anyway, this method was exported only for test purposes and didn't need to be exposed at all, so let's try to fix this (check this: https://github.com/ckeditor/ckeditor5-paste-from-office/pull/62/files#diff-01566d953bd66051289510b52b899e4bR166)

Mgsy

LGTM.

mlewand · 2019-07-29T07:59:18Z

Since there has been quite a few changes, @Mgsy can I ask you to verify once again the fix?

msamsel · 2019-07-29T08:04:13Z

@Mgsy please also check that nothing went wrong with MS Word, as the way the MS Word filters are fired has also changed.

jodator

We're almost there ;) Please take a look at the comments, most importantly:

There is too much [].forEach() used in the tests IMO - some of it is redundant and prevents us from writing a clear explanation of what a particular test checks, other than case #.
Some corrections to the docs are needed.
Some code needs to be moved around (namespace names).

I'm also thinking about a common API for filters, but I think that we can live with the current state of things. I'll create a follow-up for that (or update the current one if it exists).

jodator · 2019-07-29T09:23:14Z

src/filters/common.js

+ */
+
+/**
+ * @module paste-from-office/filters/common


I'm not sure if this is a common filter - so maybe we should just move it to removeboldtagwrapper.js.

jodator · 2019-07-29T09:25:08Z

src/filters/common.js

+ */
+export function removeBoldTagWrapper( { documentFragment, writer } ) {
+	for ( const childWithWrapper of documentFragment.getChildren() ) {
+		if ( childWithWrapper.is( 'b' ) && childWithWrapper.getStyle( 'font-weight' ) === 'normal' ) {


child would be enough - better to read here :)

jodator · 2019-07-29T09:28:50Z

src/filters/common.js

+	for ( const childWithWrapper of documentFragment.getChildren() ) {
+		if ( childWithWrapper.is( 'b' ) && childWithWrapper.getStyle( 'font-weight' ) === 'normal' ) {
+			const childIndex = documentFragment.getChildIndex( childWithWrapper );
+			const removedElement = writer.remove( childWithWrapper )[ 0 ];


I'd avoid such constructs if possible:

writer.remove( child );
writer.insertChild( index, child.getChildren(), documentFragment );

also will work.
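
To illustrate what the filter does, here is a self-contained sketch that uses plain objects instead of the CKEditor view classes. The `element()` helper and the `{ name, styles, children }` node shape are assumptions made for this example only; the real filter works on a view document fragment through a writer.

```javascript
// Plain-object stand-in for a view node: { name, styles, children }. Not the CKEditor view API.
function element( name, styles = {}, children = [] ) {
	return { name, styles, children };
}

// Replaces every top-level <b style="font-weight:normal"> wrapper with its own children.
function removeBoldWrapper( fragment ) {
	const unwrapped = [];

	for ( const child of fragment.children ) {
		const isWrapper = child.name === 'b' && child.styles[ 'font-weight' ] === 'normal';

		if ( isWrapper ) {
			// Keep the wrapper's children at the position the wrapper occupied.
			unwrapped.push( ...child.children );
		} else {
			unwrapped.push( child );
		}
	}

	// Build a new list instead of mutating the one being iterated.
	fragment.children = unwrapped;

	return fragment;
}

const fragment = element( '#fragment', {}, [
	element( 'b', { 'font-weight': 'normal' }, [ element( 'p' ), element( 'p' ) ] )
] );

console.log( removeBoldWrapper( fragment ).children.map( child => child.name ) );
```

Only a `b` element whose `font-weight` is `normal` is unwrapped, so genuinely bold content is left alone.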

jodator · 2019-07-29T09:29:29Z

src/filters/common.js

+/**
+ * Removes `<b>` tag wrapper added by Google Docs to a copied content.
+ *
+ * @param {module:engine/view/documentfragment~DocumentFragment} documentFragment


Wrong parameters in the docs.

jodator · 2019-07-29T09:33:27Z

src/normalizer.jsdoc

+/**
+ * Method applies normalization to given data.
+ *
+ * @method #exec


I might have forgotten about it, but it should be the full form: execute()

jodator · 2019-07-29T11:16:30Z

tests/pastefromoffice.js

+				{
+					'text/html': '<meta name=Generator content="Microsoft Word 15"><p class="MsoNormal">Hello world<o:p></o:p></p>'
+				}
+			].forEach( ( inputData, index ) => {


As above (helper test function vs forEach()).

jodator · 2019-07-29T11:19:23Z

tests/normalizer/mswordnormalizer.js

+	describe( 'isActive()', () => {
+		describe( 'correct data set', () => {
+			[
+				'<meta name=Generator content="Microsoft Word 15"><p>Foo bar</p>',


It does not detect the other option - only one form of compatible content.

Compare readability of the tests:

describe( 'isActive()', () => {
	it( 'should return true for MS Word content', () => {
		expect( normalizer.isActive( '<meta name=Generator content="Microsoft Word 15">Foo bar' ) ).to.be.true;
	} );

	it( 'should return true for MS Word content - in Safari', () => {
		expect( normalizer.isActive( '<meta name=Generator content="Microsoft Word 15">Foo bar' ) ).to.be.true;
	} );

	it( 'should return false for non-compatible content', () => {
		expect( normalizer.isActive( 'Foo bar' ) ).to.be.false;
	} );

	it( 'should return false for content from other source', () => {
		expect( normalizer.isActive( '' ) ).to.be.false;
	} );
} );

Marking every test with numbers provides little value for others - I have to check what the # was and why it might fail.

Those tests are simple enough and don't have to be run in a loop.

It does not detect the other option - only one form of compatible content.

There was no such check in the original data. Now it's added :)

jodator · 2019-07-29T11:25:03Z

tests/normalizer/googledocsnormalizer.js

+	const normalizer = new GoogleDocsNormalizer();
+
+	describe( 'isActive()', () => {
+		describe( 'correct data set', () => {


Check the notes about running tests in MSWordNormalizer tests.

tests/filters/space.js

tests/filters/parse.js

Mgsy

I've checked it once again and everything works fine. I didn't find any new unexpected behaviour regarding pasting from Word.

jodator · 2019-07-29T14:16:27Z

tests/pastefromoffice.js

+	// @param {Boolean} shouldBeProcessed determines if data should be marked as processed with isTransformedWithPasteFromOffice flag
+	// @param {Boolean} [isAlreadyProcessed=false] apply flag before paste from office plugin will transform the data object
+	function checkDataProcessing( inputString, shouldBeProcessed, isAlreadyProcessed = false ) {
+		// const htmlDataProcessor = new HtmlDataProcessor();


Remove - left over comment.

jodator · 2019-07-29T14:17:44Z

src/filters/removeboldwrapper.js

+ */
+
+/**
+ * @module paste-from-office/filters/removeboldtagwrapper


wrong module = should be removeboldwrapper.

jodator

Sorry for the two out-of-order comments; I forgot to start a review with them.

Finishing touches and we are good to go :)

jodator · 2019-07-29T14:19:53Z

tests/pastefromoffice.js

+
+		clipboard.fire( 'inputTransformation', data );
+
+		if ( shouldBeProcessed ) {


Those two methods of testing could be preserved - now the test is a bit too mangled ;) Also two booleans in method parameters are too much.
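
One possible shape for this, sketched with stand-ins: `createClipboard()`, `attachPasteFromOffice()` and both helper names are hypothetical and only mimic the plugin's behaviour well enough to run on their own. Only the `isTransformedWithPasteFromOffice` flag comes from the code under review. Each scenario gets its own helper, so no call site needs a pair of booleans.

```javascript
// Minimal stand-in for the clipboard plugin: a single-event emitter (hypothetical).
function createClipboard() {
	const listeners = [];

	return {
		on: callback => listeners.push( callback ),
		fire: data => listeners.forEach( callback => callback( data ) )
	};
}

// Hypothetical listener: transforms Word content unless the flag is already set.
function attachPasteFromOffice( clipboard ) {
	clipboard.on( data => {
		if ( data.isTransformedWithPasteFromOffice ) {
			return;
		}

		if ( /microsoft word/i.test( data.content ) ) {
			data.content = data.content.replace( /<meta[^>]*>/gi, '' );
			data.isTransformedWithPasteFromOffice = true;
		}
	} );
}

// Shared plumbing: fire the event with the given data object and return it.
function paste( data ) {
	const clipboard = createClipboard();

	attachPasteFromOffice( clipboard );
	clipboard.fire( data );

	return data;
}

// Scenario 1: data that no plugin has touched yet.
function pasteFresh( html ) {
	return paste( { content: html } );
}

// Scenario 2: data already flagged as transformed before the listener runs.
function pasteAlreadyProcessed( html ) {
	return paste( { content: html, isTransformedWithPasteFromOffice: true } );
}

const wordHtml = '<meta name=Generator content="Microsoft Word 15"><p>Foo</p>';

console.log( pasteFresh( wordHtml ).content );
console.log( pasteAlreadyProcessed( wordHtml ).content === wordHtml );
```

With named helpers, a test reads as `pasteFresh( html )` or `pasteAlreadyProcessed( html )` rather than as a call with `true, false` that has to be decoded against the signature.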

Mateusz Samsel added 9 commits July 10, 2019 16:53

Add function to determines the source app of the data input.

7a296c4

Add test for google docs, fix unit test creator to fully simulate cli…

53152ab

…pboard event. Extend unit test with information about data source. Some small docs and test naming improvements.

Extrct part of the logic to separate files.

ea75c23

Move utils to plugin to provide better way of testing it.

4e21458

Add unit test from Firefox source, rename test case, simplify getInpu…

60a5e53

…tType method, remove dataSource from autoamtic tests which was confusing.

Unify test suits names.

ac4c635

Remove unnecessary autoamtic test which is cover now by filter test.

f3dbf34

Add clipboard input from Firefox which is different from Chrome. Smal…

1900a9f

…l typo fixes.

Simplify test names.

529b6e3

msamsel marked this pull request as ready for review July 16, 2019 12:50

msamsel requested a review from jodator July 16, 2019 12:50

mlewand requested review from mlewand and Mgsy and removed request for mlewand July 17, 2019 13:43

Mgsy added the status:review- label Jul 18, 2019

Fix bold removal on windows. Add new test cover it. Change tests loca…

f25f38c

…tion.

msamsel removed the status:review- label Jul 18, 2019

jodator suggested changes Jul 19, 2019

Mateusz Samsel added 5 commits July 19, 2019 15:16

Introduce concept of normalizers.

4b05a67

Implement normalizers instances for google docs and ms word and extra…

d4805f6

…ct them to separate files. Clean up in contentnormalizer api.

Fix automatic test to use clipbaord event instead of run function man…

ca4b62f

…ually.

Add requires to te plugin.

ffd320b

Provide different test set for paste from office class.

f7e6dbd

Mgsy approved these changes Jul 22, 2019

Mateusz Samsel added 3 commits July 22, 2019 16:56

Provide small improvements in normalizer. Add unit test covering norm…

a6a6ad3

…alizers functionalities.

Add tests for msword normalizer, improve content normalizer class.

aa541e4

Add test for google docs normalizer.

24a349c

mlewand requested a review from Mgsy July 29, 2019 07:58

Update normalizer documentation.

ac1c0a1

jodator suggested changes Jul 29, 2019

jodator added the status:review- label Jul 29, 2019

msamsel and others added 9 commits July 29, 2019 14:16

Apply suggestions from code review

84762a8

Co-Authored-By: Maciej <[email protected]>

Change namespace of removeBoldTagWrapper filter.

7eaee98

Improve fitler styling.

219aac7

Rename exec to execute.

dd1fa0d

Rename removeBoldTagWraper to removeBoldWrapper.

1b9829c

Fix not renamed exec statement.

614005a

Fix docs description.

11ae6a1

Rename normalizer to normalizers namespace.

8bc14fe

Unify google docs and ms word normalizers.

201e07c

Mgsy approved these changes Jul 29, 2019

Mateusz Samsel added 4 commits July 29, 2019 15:11

Extract regexps matching content source to separate constant variables.

b055de3

Replace foreach loop with help funciton in pastefromoffice tests.

1bbdbd1

remove foreach loops in normalizers check.

a81e493

Fix code comments.

a27e2e4

msamsel removed the status:review- label Jul 29, 2019

msamsel requested a review from jodator July 29, 2019 14:08

jodator reviewed Jul 29, 2019

jodator suggested changes Jul 29, 2019

Mateusz Samsel added 2 commits July 29, 2019 16:37

Remove leftover, correct namespace.

d79a224

Split tests into groups, provide helper for each group.

ee7e3a0

msamsel requested a review from jodator July 29, 2019 15:00

jodator merged commit 8102de3 into master Jul 29, 2019

jodator deleted the t/61 branch July 29, 2019 15:12
