...
Sample input document
Message-ID: <32220443.1075841552668.JavaMail.evans@thyme>
Date: Mon, 9 Jul 2001 11:33:32 -0700 (PDT)
From: cara.semperger@enron.com
To: will.smith@enron.com
Subject: RE: Testing Preschedule workspace
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
X-From: Semperger, Cara </O=ENRON/OU=NA/CN=RECIPIENTS/CN=CSEMPER>
X-To: Smith, Will </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Wsmith>
X-cc:
X-bcc:
X-Folder: \ExMerge - Semperger, Cara\Deleted Items
X-Origin: SEMPERGER-C
X-FileName: cara semperger 6-26-02.PST
I am trying to pull it up now, it's taking a long time
-----Original Message-----
From: =09Semperger, Cara =20
Sent:=09Monday, July 09, 2001 10:40 AM
To:=09Smith, Will; Atta, Asem
Cc:=09Bentley, Corry; Poston, David
Subject:=09Testing Preschedule workspace
Good Morning,
My target testing date today is June 18th, I am running in Test P in Local=
Enpower using actual data from our scheduling sheets re-arranged to meet t=
he new guidelines.
The daily deals I coded X in columns J and N, the Month long bookouts and =
BOM bookouts I coded R. =20
What worked:
I was able to retrieve my saved workspace with all data intact. I had previ=
ously sucessfully copied and pasted my entire sheet from EXCEL to the PSW.
I was able to run the build route report with the criteria of "Starting On-=
June 18-PaloVerde-Day of week Mask Activated-Report Changes activated." A =
check of deals actually scheduled vs. build route results showed that all d=
eals were extracted correctly from Enpower. Because I am working on closed =
dates, a cumulative test of this app will not be fully testable until produ=
ction. We are expecting to see the same functionality as the current incarn=
ation of Build route. The data extracted should be read only, time stamped,=
and when run mulitple times additional data should be shown below previous=
ly extracted data. The improvement we are expecting to see is the app shou=
ld not duplicate deal strips on dates that have no physical power flow. (We=
st Light Load currently does this in Start view, but not Active view)
Navigating around the scheduling sheet itself I was able to accurately exec=
ute the sort function on a single criteria at a time. Multiple sorting will=
contunue to be done in excel, or we can do a series of single sorts in the=
PSW to acheive the same result.
Routing deals: Will had deleted all routes for June 18th, starting me with =
a clean slate. I made every path be for DAY. I was unable to confirm total=
unrouted MWH, as the real time position manager does not seem to be functi=
oning in TESTP. The routing appeared to take 19 minutes with the status bar=
showing steady progress during that time. This time is 15-17 minutes longe=
r than current speed using the Excel Macro system we have now. The error li=
st gave me a row by row description of what did not route, a very useful to=
ol. OK was visible on all rows that the PSW believed that it had routed. I=
had difficulty checking the routing results, as I kept getting BDE errors =
in Scheduling after routing had occurred (Local Enpower). Scheduling kept s=
tarting up in 1899. I was unable to login to TestP through Terminal server=
2, but was able to in Terminal Server 5. The results there were very encou=
raging! Most routing was done, and a spot check of deals shows that they we=
re routed properly. The deals that were not routed appear to be due to a us=
er error of deal number duplication in the sheet. This is consistent with w=
hat I would expect. I will further evaluate routing ability with our more c=
omplicated paths later. This routing was very easy, a large point with on p=
eak non shaped deals only.
Things I did not expect that I liked:
When I highlight a group of cells in Build Route, it stays highlighted when=
I move up to the scheduling sheet to highlight a comparison group of cells=
. This is very handy for double checking Build route against the scheduler=
's sheet.
What does not appear to be working at this time:
The physical or not physical flag of path does not seem to be showing up pr=
operly in routing.
Path Confirmation: The running time appeared to be over one hour for one s=
heet, only 70 rows of the sheet being flagged for insertion into confirmati=
on. This current speed will not be sufficient to work in production. Also, =
many rows that were flagged for confirmation were not imported, and I canno=
t find an error log to help understand why deals were not imported to path =
confirmation.
When the path confirmation task was finished, the application simply froze=
. The status bar was no longer visible, leading me to believe that it was =
done, however the app was not able to be saved or closed or examined furthe=
r.
My conclusions:
The build route and routing functions work well enough to use in production=
, the copy-paste function works well for the West desk per our connectivity=
issues.
Path Confirmation is not functioning at this point, and appears to be blowi=
ng up the app. No data was visible for June 18th even after the PSW ran thr=
ough its import function.
Please let me know when the issues I have named have been addressed and are=
ready for further testing.
Thanks
Cara
503/464-3814
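The sample message declares `Content-Transfer-Encoding: quoted-printable`, which is why many body lines end in `=` (a soft line break) and contain sequences such as `=09` (tab) and `=20` (space). As a rough illustration only, and not part of the harvester itself, the decoding could be sketched as follows. The function name `decodeQuotedPrintable` is hypothetical, and the sketch assumes single-byte us-ascii content, as the message header states.

```javascript
// Hypothetical helper: decodes quoted-printable text, assuming us-ascii content.
function decodeQuotedPrintable(text) {
  return text
    // A trailing "=" is a soft line break: join the line with the next one.
    .replace(/=\r?\n/g, "")
    // "=XX" is one byte written as two hex digits, e.g. =09 (tab), =20 (space).
    .replace(/=([0-9A-Fa-f]{2})/g, (_, hex) => String.fromCharCode(parseInt(hex, 16)));
}

// Two fragments taken from the sample message body.
const sample = [
  "Sent:=09Monday, July 09, 2001 10:40 AM",
  "Enpower using actual data from our scheduling sheets re-arranged to meet t=",
  "he new guidelines.",
].join("\n");

const decoded = decodeQuotedPrintable(sample);
console.log(decoded);
```

Soft line breaks are removed before the hex escapes are expanded, so a word split across lines (`t=` / `he`) is rejoined into `the`.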
Source
{
  "description": "All of the Enron emails corpus with TextRank keyword extraction enabled.",
  "extractType": "File",
  "file": {
    "domain": "DOMAIN",
    "password": "PASSWORD",
    "username": "USER"
  },
  "isPublic": true,
  "mediaType": "Email",
  "searchCycle_secs": -1,
  "structuredAnalysis": {
    "associations": [
      {
        "associations": [
          {
            "assoc_type": "Event",
            "entity1": "$SCRIPT( return _doc.metadata._FILE_METADATA_[0].metadata.Author[0];)",
            "entity2": "$SCRIPT(return _value;)",
            "iterateOver": "Message-To",
            "time_start": "$SCRIPT( return _doc.publishedDate;)",
            "verb": "emailed",
            "verb_category": "emailed/communicated"
          }
        ],
        "iterateOver": "email_meta"
      }
    ],
    "entities": [
      {
        "dimension": "What",
        "disambiguated_name": "$SCRIPT( return _doc.metadata._FILE_METADATA_[0].metadata.Author[0];)",
        "type": "Account",
        "useDocGeo": false
      },
      {
        "entities": [
          {
            "dimension": "What",
            "disambiguated_name": "",
            "iterateOver": "Message-To",
            "type": "Account",
            "useDocGeo": false
          }
        ],
        "iterateOver": "email_meta"
      }
    ],
    "scriptEngine": "JavaScript",
    "title": "$SCRIPT( return _doc.metadata._FILE_METADATA_[0].metadata.subject[0];)"
  },
  "tags": [
    "enron",
    "email",
    "fraud"
  ],
  "title": "All Enron Emails (TextRank)",
  "unstructuredAnalysis": {
    "meta": [
      {
        "context": "All",
        "fieldName": "email_meta",
        "flags": "m",
        "script": "var x=_metadata._FILE_METADATA_[0].metadata;x;",
        "scriptlang": "javascript"
      }
    ],
    "simpleTextCleanser": [
      {
        "field": "fullText",
        "flags": "md",
        "replacement": " ",
        "script": "(?:\\[.*?\\])",
        "scriptlang": "regex"
      },
      {
        "field": "description",
        "flags": "md",
        "replacement": " ",
        "script": "(?:\\[.*?\\])",
        "scriptlang": "regex"
      },
      {
        "field": "fullText",
        "flags": "md",
        "replacement": ". ",
        "script": "<.*?>",
        "scriptlang": "regex"
      },
      {
        "field": "description",
        "flags": "md",
        "replacement": ". ",
        "script": "<.*?>",
        "scriptlang": "regex"
      },
      {
        "field": "fullText",
        "flags": "md",
        "replacement": ". ",
        "script": "(?:>|<)",
        "scriptlang": "regex"
      },
      {
        "field": "description",
        "flags": "md",
        "replacement": ". ",
        "script": "(?:>|<)",
        "scriptlang": "regex"
      },
      {
        "field": "fullText",
        "replacement": " ",
        "script": "(?:[-]{4,}(.*[-]{4,}|\\n))",
        "scriptlang": "regex"
      },
      {
        "field": "description",
        "replacement": " ",
        "script": "(?:[-]{4,}(.*[-]{4,}|\\n))",
        "scriptlang": "regex"
      },
      {
        "field": "fullText",
        "replacement": " ",
        "script": "(?:\\*{2,})",
        "scriptlang": "regex"
      },
      {
        "field": "description",
        "replacement": " ",
        "script": "(?:\\*{2,})",
        "scriptlang": "regex"
      }
    ]
  },
  "url": "smb://modus:139/enron/enron_mail_20110402/maildir/",
  "useExtractor": "textrank",
  "useTextExtractor": "none"
}
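The `simpleTextCleanser` entries in the source above apply the same sequence of regex replacements to both `fullText` and `description`. The following is a minimal sketch of that sequence in plain JavaScript, not the platform's implementation. It rests on two assumptions: that the `"md"` flags mean multiline plus dot-matches-newline (`m` and `s` in JavaScript), and that every match is replaced (hence the added `g`). The `rules` array and the `cleanse` function are hypothetical names used for illustration.

```javascript
// Patterns copied from the "simpleTextCleanser" block (one copy per rule, not per field).
// Assumption: "md" maps to the JavaScript flags "m" and "s"; "g" replaces all matches.
const rules = [
  { pattern: "(?:\\[.*?\\])", flags: "gms", replacement: " " },   // [bracketed] spans
  { pattern: "<.*?>", flags: "gms", replacement: ". " },          // <tag>-like spans
  { pattern: "(?:>|<)", flags: "gms", replacement: ". " },        // leftover angle brackets
  { pattern: "(?:[-]{4,}(.*[-]{4,}|\\n))", flags: "g", replacement: " " }, // ----separators----
  { pattern: "(?:\\*{2,})", flags: "g", replacement: " " },       // runs of asterisks
];

// Apply each rule in order to the text, as the config lists them.
function cleanse(text) {
  return rules.reduce(
    (acc, r) => acc.replace(new RegExp(r.pattern, r.flags), r.replacement),
    text
  );
}

const raw = "taking a long time\n-----Original Message-----\nFrom: Semperger, Cara";
const cleaned = cleanse(raw);
console.log(JSON.stringify(cleaned));
```

In this sketch the `-----Original Message-----` separator collapses to a single space, which is consistent with the `description` field of the sample output document, where only a space remains between the reply text and the quoted `From:` line.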
Sample output document
{
"_id": "5048efb0e4b01fd6455420ee",
"title": "RE: Testing Preschedule workspace",
"url": "smb://modus:139/enron/testing/semperger-c/deleted_items/37QTKE~3",
"created": "Sep 6, 2012 06:42:01 PM UTC",
"modified": "Jul 24, 2012 01:13:02 AM UTC",
"publishedDate": "Jul 9, 2001 06:33:32 PM UTC",
"source": [
"Enron Emails (TextRank)"
],
"sourceKey": [
"modus.139.enron.testing.."
],
"mediaType": [
"Email"
],
"description": "I am trying to pull it up now, it's taking a long time\r\n\r\n \r\nFrom: \tSmith, Will \r\nSent:\tMonday, July 09, 2001 11:28 AM\r\nTo:\tSemperger, Cara\r\nSubject:\tRE: Testing Preschedule workspace\r\n\r\nYes, but Vish made the changes in Table Edit. : - )\r\n\r\nWill\r\n\r\n \r\nFrom: \tSemperger, Cara \r\nSent:\tMonday, July 09, 2001 1:20 PM\r\nTo:\tSmith, Will\r\nSubject:\tRE: Testing Preschedule workspace\r\n\r\nSo, this table edit that Brett is asking me to test is really from ",
"entities": [
{
"disambiguated_name": "on- june 18-paloverde-day",
"index": "on- june 18-paloverde-day/keyword",
"actual_name": "on- june 18-paloverde-day",
"type": "Keyword",
"relevance": 0.10585404743253149,
"frequency": 1,
"totalfrequency": 12,
"doccount": 12,
"dimension": "What"
},
{
"disambiguated_name": "mulitple times additional data",
"index": "mulitple times additional data/keyword",
"actual_name": "mulitple times additional data",
"type": "Keyword",
"relevance": 0.18088061045762382,
"frequency": 1,
"totalfrequency": 12,
"doccount": 12,
"dimension": "What"
},
{
"disambiguated_name": "scheduling sheets",
"index": "scheduling sheets/keyword",
"actual_name": "scheduling sheets",
"type": "Keyword",
"relevance": 0.15086086188384693,
"frequency": 1,
"totalfrequency": 20,
"doccount": 20,
"dimension": "What"
},
{
"disambiguated_name": "app",
"index": "app/keyword",
"actual_name": "app",
"type": "Keyword",
"relevance": 0.20415634782171557,
"frequency": 1,
"totalfrequency": 58,
"doccount": 58,
"dimension": "What"
},
{
"disambiguated_name": "data",
"index": "data/keyword",
"actual_name": "data",
"type": "Keyword",
"relevance": 0.1361375118885727,
"frequency": 1,
"totalfrequency": 3323,
"doccount": 3323,
"dimension": "What"
},
{
"disambiguated_name": "paths",
"index": "paths/keyword",
"actual_name": "paths",
"type": "Keyword",
"relevance": 0.2041916488834702,
"frequency": 1,
"totalfrequency": 99,
"doccount": 99,
"dimension": "What"
},
{
"disambiguated_name": "build route report",
"index": "build route report/keyword",
"actual_name": "build route report",
"type": "Keyword",
"relevance": 0.11476307758997932,
"frequency": 1,
"totalfrequency": 36,
"doccount": 36,
"dimension": "What"
},
{
"disambiguated_name": "testing preschedule workspace cara",
"index": "testing preschedule workspace cara/keyword",
"actual_name": "testing preschedule workspace cara",
"type": "Keyword",
"relevance": 0.16803833041631702,
"frequency": 1,
"totalfrequency": 8,
"doccount": 8,
"dimension": "What"
},
{
"disambiguated_name": "physical power flow",
"index": "physical power flow/keyword",
"actual_name": "physical power flow",
"type": "Keyword",
"relevance": 0.11805512187037151,
"frequency": 1,
"totalfrequency": 17,
"doccount": 17,
"dimension": "What"
},
{
"disambiguated_name": "i",
"index": "i/keyword",
"actual_name": "i",
"type": "Keyword",
"relevance": 0.13651904141534263,
"frequency": 1,
"totalfrequency": 18162,
"doccount": 18162,
"dimension": "What"
},
{
"disambiguated_name": "total running time",
"index": "total running time/keyword",
"actual_name": "total running time",
"type": "Keyword",
"relevance": 0.11233232851584997,
"frequency": 1,
"totalfrequency": 10,
"doccount": 10,
"dimension": "What"
},
{
"disambiguated_name": "time",
"index": "time/keyword",
"actual_name": "time",
"type": "Keyword",
"relevance": 0.34020922533185516,
"frequency": 1,
"totalfrequency": 17102,
"doccount": 17102,
"dimension": "What"
},
{
"disambiguated_name": "psw",
"index": "psw/keyword",
"actual_name": "psw",
"type": "Keyword",
"relevance": 0.13625985262266815,
"frequency": 1,
"totalfrequency": 46,
"doccount": 46,
"dimension": "What"
},
{
"disambiguated_name": "semperger",
"index": "semperger/keyword",
"actual_name": "semperger",
"type": "Keyword",
"relevance": 0.2724417241053495,
"frequency": 1,
"totalfrequency": 226,
"doccount": 226,
"dimension": "What"
},
{
"disambiguated_name": "peak non shaped deals",
"index": "peak non shaped deals/keyword",
"actual_name": "peak non shaped deals",
"type": "Keyword",
"relevance": 0.19127581970645322,
"frequency": 1,
"totalfrequency": 12,
"doccount": 12,
"dimension": "What"
},
{
"disambiguated_name": "table edit",
"index": "table edit/keyword",
"actual_name": "table edit",
"type": "Keyword",
"relevance": 0.21207334129182112,
"frequency": 1,
"totalfrequency": 32,
"doccount": 32,
"dimension": "What"
},
{
"disambiguated_name": "week mask activated-report changes",
"index": "week mask activated-report changes/keyword",
"actual_name": "week mask activated-report changes",
"type": "Keyword",
"relevance": 0.1484580867667756,
"frequency": 1,
"totalfrequency": 12,
"doccount": 12,
"dimension": "What"
},
{
"disambiguated_name": "excel macro system",
"index": "excel macro system/keyword",
"actual_name": "excel macro system",
"type": "Keyword",
"relevance": 0.12208201691477336,
"frequency": 1,
"totalfrequency": 12,
"doccount": 12,
"dimension": "What"
},
{
"disambiguated_name": "real time position manager",
"index": "real time position manager/keyword",
"actual_name": "real time position manager",
"type": "Keyword",
"relevance": 0.19213464212989614,
"frequency": 1,
"totalfrequency": 39,
"doccount": 39,
"dimension": "What"
},
{
"disambiguated_name": "testing preschedule workspace",
"index": "testing preschedule workspace/keyword",
"actual_name": "testing preschedule workspace",
"type": "Keyword",
"relevance": 0.17652180791002264,
"frequency": 1,
"totalfrequency": 12,
"doccount": 12,
"dimension": "What"
},
{
"disambiguated_name": "cara",
"index": "cara/keyword",
"actual_name": "cara",
"type": "Keyword",
"relevance": 0.20414801224595303,
"frequency": 1,
"totalfrequency": 736,
"doccount": 736,
"dimension": "What"
},
{
"disambiguated_name": "smith",
"index": "smith/keyword",
"actual_name": "smith",
"type": "Keyword",
"relevance": 0.27217844252943296,
"frequency": 1,
"totalfrequency": 783,
"doccount": 783,
"dimension": "What"
},
{
"disambiguated_name": "david subject",
"index": "david subject/keyword",
"actual_name": "david subject",
"type": "Keyword",
"relevance": 0.15139765579194864,
"frequency": 1,
"totalfrequency": 930,
"doccount": 930,
"dimension": "What"
},
{
"disambiguated_name": "sheet",
"index": "sheet/keyword",
"actual_name": "sheet",
"type": "Keyword",
"relevance": 0.20416968108320477,
"frequency": 1,
"totalfrequency": 436,
"doccount": 436,
"dimension": "What"
},
{
"disambiguated_name": "total unrouted mwh",
"index": "total unrouted mwh/keyword",
"actual_name": "total unrouted mwh",
"type": "Keyword",
"relevance": 0.1141385057566826,
"frequency": 1,
"totalfrequency": 16,
"doccount": 16,
"dimension": "What"
},
{
"disambiguated_name": "target testing date today",
"index": "target testing date today/keyword",
"actual_name": "target testing date today",
"type": "Keyword",
"relevance": 0.18726422286448255,
"frequency": 1,
"totalfrequency": 12,
"doccount": 12,
"dimension": "What"
},
{
"disambiguated_name": "deals",
"index": "deals/keyword",
"actual_name": "deals",
"type": "Keyword",
"relevance": 0.34025706056156424,
"frequency": 1,
"totalfrequency": 5740,
"doccount": 5261,
"dimension": "What"
},
{
"disambiguated_name": "double checking build route",
"index": "double checking build route/keyword",
"actual_name": "double checking build route",
"type": "Keyword",
"relevance": 0.18886230001363824,
"frequency": 1,
"totalfrequency": 12,
"doccount": 12,
"dimension": "What"
},
{
"disambiguated_name": "path confirmation task",
"index": "path confirmation task/keyword",
"actual_name": "path confirmation task",
"type": "Keyword",
"relevance": 0.12326679747563907,
"frequency": 1,
"totalfrequency": 16,
"doccount": 16,
"dimension": "What"
},
{
"disambiguated_name": "routes",
"index": "routes/keyword",
"actual_name": "routes",
"type": "Keyword",
"relevance": 0.40825322818399834,
"frequency": 1,
"totalfrequency": 142,
"doccount": 142,
"dimension": "What"
},
{
"disambiguated_name": "west light load",
"index": "west light load/keyword",
"actual_name": "west light load",
"type": "Keyword",
"relevance": 0.11288042191103252,
"frequency": 1,
"totalfrequency": 16,
"doccount": 16,
"dimension": "What"
},
{
"disambiguated_name": "rows",
"index": "rows/keyword",
"actual_name": "rows",
"type": "Keyword",
"relevance": 0.2721919612854695,
"frequency": 1,
"totalfrequency": 72,
"doccount": 72,
"dimension": "What"
},
{
"disambiguated_name": "path confirmation",
"index": "path confirmation/keyword",
"actual_name": "path confirmation",
"type": "Keyword",
"relevance": 0.2124247462661659,
"frequency": 1,
"totalfrequency": 169,
"doccount": 169,
"dimension": "What"
},
{
"disambiguated_name": "month long bookouts",
"index": "month long bookouts/keyword",
"actual_name": "month long bookouts",
"type": "Keyword",
"relevance": 0.12514486683483175,
"frequency": 1,
"totalfrequency": 18,
"doccount": 18,
"dimension": "What"
},
{
"disambiguated_name": "deal number duplication",
"index": "deal number duplication/keyword",
"actual_name": "deal number duplication",
"type": "Keyword",
"relevance": 0.12910499876653425,
"frequency": 1,
"totalfrequency": 16,
"doccount": 16,
"dimension": "What"
},
{
"disambiguated_name": "minutes",
"index": "minutes/keyword",
"actual_name": "minutes",
"type": "Keyword",
"relevance": 0.13613482658399254,
"frequency": 1,
"totalfrequency": 1234,
"doccount": 1172,
"dimension": "What"
},
{
"disambiguated_name": "cara.semperger@enron.com",
"index": "cara.semperger@enron.com/account",
"actual_name": "cara.semperger@enron.com",
"type": "Account",
"relevance": 0,
"frequency": 1,
"totalfrequency": 3251,
"doccount": 3251,
"dimension": "What"
},
{
"disambiguated_name": "will.smith@enron.com",
"index": "will.smith@enron.com/account",
"actual_name": "will.smith@enron.com",
"type": "Account",
"relevance": 0,
"frequency": 1,
"totalfrequency": 408,
"doccount": 408,
"dimension": "What"
}
],
"tags": [
"enron",
"email",
"fraud"
],
"communityId": [
"500df237e4b00e332fe993aa"
],
"associations": [
{
"entity1": "cara.semperger@enron.com",
"entity1_index": "cara.semperger@enron.com/account",
"verb": "emailed",
"verb_category": "emailed/communicated",
"entity2": "will.smith@enron.com",
"entity2_index": "will.smith@enron.com/account",
"time_start": "2001-07-09T14:33:32",
"assoc_type": "Event"
}
],
"metadata": {
"_FILE_METADATA_": [
[
{
"metadata": {
"Creation-Date": [
"2001-07-09T18:33:32Z"
],
"subject": [
"RE: Testing Preschedule workspace"
],
"Message-From": [
"cara.semperger@enron.com"
],
"Author": [
"cara.semperger@enron.com"
],
"Message-To": [
"will.smith@enron.com"
],
"date": [
"2001-07-09T18:33:32Z"
],
"Content-Type": [
"message/rfc822"
]
}
}
]
],
"email_meta": [
[
{
"Creation-Date": [
"2001-07-09T18:33:32Z"
],
"Message-To": [
"will.smith@enron.com"
],
"Content-Type": [
"message/rfc822"
],
"subject": [
"RE: Testing Preschedule workspace"
],
"date": [
"2001-07-09T18:33:32Z"
],
"Author": [
"cara.semperger@enron.com"
],
"Message-From": [
"cara.semperger@enron.com"
]
}
]
]
}
}
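The single `emailed` association in the sample output follows from the nested `associations` spec in the source: the outer level iterates over `email_meta`, the inner level iterates over `Message-To`, and one event is emitted from the author to each recipient. The sketch below imitates that expansion in plain JavaScript. It is an illustration under stated assumptions, not the platform's code: `buildAssociations` is a hypothetical name, the input is a flat array of metadata objects (the output document shows one extra level of array nesting), the author is read from the same entry rather than from `_FILE_METADATA_`, and the `/account` index suffix is simply copied from what the output shows.

```javascript
// Hypothetical helper that mimics the nested "iterateOver" association spec.
// emailMeta: array of metadata objects shaped like the "email_meta" entries.
function buildAssociations(emailMeta, timeStart) {
  const associations = [];
  for (const meta of emailMeta) {                  // outer "iterateOver": "email_meta"
    const author = meta["Author"][0];              // entity1: the sender
    for (const recipient of meta["Message-To"]) {  // inner "iterateOver": "Message-To"
      associations.push({
        entity1: author,
        entity1_index: author + "/account",        // suffix as seen in the sample output
        verb: "emailed",
        verb_category: "emailed/communicated",
        entity2: recipient,                        // plays the role of _value in the spec
        entity2_index: recipient + "/account",
        time_start: timeStart,
        assoc_type: "Event",
      });
    }
  }
  return associations;
}

const emailMeta = [
  {
    "Author": ["cara.semperger@enron.com"],
    "Message-To": ["will.smith@enron.com"],
  },
];

const result = buildAssociations(emailMeta, "2001-07-09T14:33:32");
console.log(JSON.stringify(result, null, 2));
```

A message with several `Message-To` addresses would yield one association per recipient under the same assumptions.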