forked from ericwhyne/xdata_meta
data.json
[
{
"XData Team":"Aptima Inc.",
"Software":"Network Query by Example",
"Internal Link":"",
"External Link":"https://github.com/Aptima/pattern-matching",
"Public Code Repo":"https://github.com/Aptima/pattern-matching.git",
"Stats":"pattern-matching",
"Characteristics of the Software (what software does)":"Hadoop MapReduce-over-Hive based implementation of network query by example utilizing attributed network pattern matching.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Boeing/Pitt",
"Software":"SMILE-WIDE: A scalable Bayesian network library",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Boeing+-+Pitt",
"External Link":"http://smilewide.github.io/main/",
"Public Code Repo":"https://github.com/SmileWide/main.git",
"Stats":"main",
"Characteristics of the Software (what software does)":"SMILE-WIDE is a scalable Bayesian network library. Initially, it is a version of the SMILE library, as in SMILE With Integrated Distributed Execution. The general approach has been to provide an API similar to the existing API SMILE developers use to build \"local,\" single-threaded applications. However, we provide \"vectorized\" operations that hide a Hadoop-distributed implementation. Apart from invoking a few idioms like generic Hadoop command line argument parsing, these appear to the developer as if they were executed locally.",
"Xdata Git Location":"tools\\analytics\\boeing-pitt\\smile-wide-final-summercamp",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Carnegie Mellon University",
"Software":"Support Distribution Machines ",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=6717624",
"External Link":"https://github.com/dougalsutherland/py-sdm",
"Public Code Repo":"https://github.com/dougalsutherland/py-sdm.git",
"Stats":"py-sdm",
"Characteristics of the Software (what software does)":"Python implementation of the nonparametric divergence estimators described by Barnabas Poczos, Liang Xiong, Jeff Schneider (2011). Nonparametric divergence estimation with applications to machine learning on distributions. Uncertainty in Artificial Intelligence. ( http://autonlab.org/autonweb/20287.html ) and also their use in support vector machines, as described by Dougal J. Sutherland, Liang Xiong, Barnabas Poczos, Jeff Schneider (2012). Kernels on Sample Sets via Nonparametric Divergence Estimates. ( http://arxiv.org/abs/1202.0302 ).",
"Xdata Git Location":"tools\\analytics\\cmu",
"License":"BSD",
"Language (Primary)":"Python",
"Language (Secondary)":"",
"Category":"Analytics"
},
{
"XData Team":"Continuum Analytics",
"Software":"Blaze",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Continuum+Analytics+Home+Page",
"External Link":"https://github.com/ContinuumIO/blaze",
"Public Code Repo":"https://github.com/ContinuumIO/blaze.git",
"Stats":"blaze",
"Characteristics of the Software (what software does)":"Blaze is the next-generation of NumPy. It is designed as a foundational set of abstractions on which to build out-of-core and distributed algorithms over a wide variety of data sources and to extend the structure of NumPy itself. Blaze allows easy composition of low level computation kernels (C, Fortran, Numba) to form complex data transformations on large datasets. In Blaze, computations are described in a high-level language (Python) but executed on a low-level runtime (outside of Python), enabling the easy mapping of high-level expertise to data without sacrificing low-level performance. Blaze aims to bring Python and NumPy into the massively-multicore arena, allowing it to leverage many CPU and GPU cores across computers, virtual machines and cloud services.",
"Xdata Git Location":"tools\\analytics\\continuum\\blaze",
"License":"BSD",
"Language (Primary)":"Python",
"Language (Secondary)":"",
"Category":"Infrastructure"
},
{
"XData Team":"Continuum Analytics",
"Software":"Numba",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Continuum+Analytics+Home+Page",
"External Link":"https://github.com/numba/numba",
"Public Code Repo":"https://github.com/numba/numba.git",
"Stats":"numba",
"Characteristics of the Software (what software does)":"Numba is an Open Source NumPy-aware optimizing compiler for Python sponsored by Continuum Analytics, Inc. It uses the LLVM compiler infrastructure to compile Python syntax to machine code.<br/><br/>It is aware of NumPy arrays as typed memory regions and so can speed-up code using NumPy arrays. Other, less well-typed code is translated to Python C-API calls effectively removing the \"interpreter\" but not removing the dynamic indirection.<br/><br/>Numba is also not a tracing just in time (JIT) compiler. It compiles your code before it runs either using run-time type information or type information you provide in the decorator.<br/><br/>Numba is a mechanism for producing machine code from Python syntax and typed data structures such as those that exist in NumPy.",
"Xdata Git Location":"tools\\analytics\\continuum\\numba",
"License":"BSD",
"Language (Primary)":"Python",
"Language (Secondary)":"",
"Category":"Infrastructure"
},
{
"XData Team":"Continuum Analytics",
"Software":"Bokeh",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Continuum+Analytics+Home+Page",
"External Link":"http://bokeh.pydata.org",
"Public Code Repo":"https://github.com/ContinuumIO/bokeh.git",
"Stats":"bokeh",
"Characteristics of the Software (what software does)":"Bokeh (pronounced bo-Kay or bo-Kuh) is a Python interactive visualization library for large datasets that natively uses the latest web technologies. Its goal is to provide elegant, concise construction of novel graphics in the style of Protovis/D3, while delivering high-performance interactivity over large data to thin clients.",
"Xdata Git Location":"tools/visualization/continuum/bokeh; tools/visualization/continuum/bokehjs",
"License":"BSD",
"Language (Primary)":"Python",
"Language (Secondary)":"Javascript, Coffeescript",
"Category":"Visualization"
},
{
"XData Team":"Continuum Analytics and Indiana University",
"Software":"Abstract Rendering",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Continuum+Analytics+Home+Page",
"External Link":"http://www.github.com/JosephCottam/AbstractRendering",
"Public Code Repo":"https://github.com/JosephCottam/AbstractRendering.git",
"Stats":"AbstractRendering",
"Characteristics of the Software (what software does)":"Information visualization rests on the idea that a meaningful relationship can be drawn between pixels and data. This is most often mediated by geometric entities (such as circles, squares and text) but always involves pixels eventually to display. In most systems, the pixels are tucked away under levels of abstraction in the rendering system. Abstract Rendering takes the opposite approach: expose the pixels and gain powerful pixel-level control. This pixel-level power is a complement to many existing visualization techniques. It is an elaboration on rendering, not an analytic or projection step, so it can be used as an epilogue to many existing techniques. In standard rendering, geometric objects are projected to an image and represented on that image's discrete pixels. The source space is an abstract canvas that contains logically continuous geometric primitives and the target space is an image that contains discrete colors. Abstract Rendering fits between these two states. It introduces a discretization of the data at the pixel-level, but not necessarily all the way to colors. This enables many pixel-level concerns to be efficiently and concisely captured.",
"Xdata Git Location":"/tools/visualization/continuum/AbstractRendering/",
"License":"BSD",
"Language (Primary)":"Java",
"Language (Secondary)":"",
"Category":"Visualization"
},
{
"XData Team":"Continuum Analytics",
"Software":"CDX",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Continuum+Analytics+Home+Page",
"External Link":"https://github.com/ContinuumIO/cdx",
"Public Code Repo":"https://github.com/ContinuumIO/cdx.git",
"Stats":"cdx",
"Characteristics of the Software (what software does)":"Software to visualize the structure of large or complex datasets / produce guides that help users or algorithms gauge the quality of various kinds of graphs & plots.",
"Xdata Git Location":"/tools/visualization/continuum/",
"License":"BSD",
"Language (Primary)":"Javascript",
"Language (Secondary)":"Coffeescript, Python",
"Category":"Visualization"
},
{
"XData Team":"Continuum Analytics and Indiana University",
"Software":"Stencil",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Continuum+Analytics+Home+Page",
"External Link":"https://github.com/JosephCottam/Stencil",
"Public Code Repo":"https://github.com/JosephCottam/Stencil.git",
"Stats":"Stencil",
"Characteristics of the Software (what software does)":"Stencil is a grammar-based approach to visualization specification at a higher-level.",
"Xdata Git Location":"tools\\visualizations\\continuum\\Stencil",
"License":"BSD",
"Language (Primary)":"Clojure",
"Language (Secondary)":"",
"Category":"Visualization"
},
{
"XData Team":"Data Tactics Corporation",
"Software":"Vowpal Wabbit",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Cloud+Analytics+for+Structure+Extractions+%28CASE%29+Group",
"External Link":"https://github.com/JohnLangford/vowpal_wabbit",
"Public Code Repo":"https://github.com/JohnLangford/vowpal_wabbit.git",
"Stats":"vowpal_wabbit",
"Characteristics of the Software (what software does)":"The Vowpal Wabbit (VW) project is a fast out-of-core learning system sponsored by Microsoft Research and (previously) Yahoo! Research. Support is available through the mailing list. There are two ways to have a fast learning algorithm: (a) start with a slow algorithm and speed it up, or (b) build an intrinsically fast learning algorithm. This project is about approach (b), and it's reached a state where it may be useful to others as a platform for research and experimentation. There are several optimization algorithms available with the baseline being sparse gradient descent (GD) on a loss function (several are available). The code should be easily usable. Its only external dependence is on the boost library, which is often installed by default.",
"Xdata Git Location":"tools\\analytics\\dt",
"License":"BSD",
"Language (Primary)":"C",
"Language (Secondary)":"",
"Category":"Analytics"
},
{
"XData Team":"Data Tactics Corporation",
"Software":"Circuit",
"Internal Link":"Pending",
"External Link":"http://www.gocircuit.org/",
"Public Code Repo":"https://code.google.com/p/gocircuit/source/checkout",
"Stats":"",
"Characteristics of the Software (what software does)":"Go Circuit reduces the human development and sustenance costs of complex massively-scaled systems nearly to the level of their single-process counterparts. It is a combination of proven ideas from the Erlang ecosystem of distributed embedded devices and Go's ecosystem of Internet application development. Go Circuit extends the reach of Go's linguistic environment to multi-host/multi-process applications.",
"Xdata Git Location":"",
"License":"ALv2",
"Language (Primary)":"",
"Language (Secondary)":"",
"Category":"Infrastructure"
},
{
"XData Team":"Georgia Tech / GTRI",
"Software":"libNMF: a high-performance library for nonnegative matrix factorization and hierarchical clustering",
"Software (Short)":"libNMF",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=7274597",
"External Link":"http://www.cc.gatech.edu/~hpark/",
"Public Code Repo":"Pending",
"Stats":"",
"Characteristics of the Software (what software does)":"LibNMF is a high-performance, parallel library for nonnegative matrix factorization on both dense and sparse matrices written in C++. Implementations of several different NMF algorithms are provided, including multiplicative updating, hierarchical alternating least squares, nonnegative least squares with block principal pivoting, and a new rank2 algorithm. The library provides an implementation of hierarchical clustering based on the rank2 NMF algorithm.",
"Xdata Git Location":"tools\\analytics\\gatech\\_x000D_\n",
"License":"ALv2",
"Language (Primary)":"",
"Language (Secondary)":"",
"Category":"Analytics"
},
{
"XData Team":"IBM Research",
"Software":"SKYLARK: Randomized Numerical Linear Algebra and ML",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=2752591",
"External Link":"http://xdata-skylark.github.io/",
"Public Code Repo":"2014-05-15",
"Stats":"",
"Characteristics of the Software (what software does)":"SKYLARK implements Numerical Linear Algebra (NLA) kernels based on sketching for distributed computing platforms. Sketching reduces dimensionality through randomization, and includes Johnson-Lindenstrauss random projection (JL); a faster version of JL based on fast transform techniques; sparse techniques that can be applied in time proportional to the number of nonzero matrix entries; and methods for approximating kernel functions and Gram matrices arising in nonlinear statistical modeling problems. We have a library of such sketching techniques, built using MPI in C++ and callable from Python, and are applying the library to regression, low-rank approximation, and kernel-based machine learning tasks, among other problems. ",
"Xdata Git Location":"tools\\analytics\\ibm\\skylark",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Institute for Creative Technologies / USC",
"Software":"Immersive Body-Based Interactions",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=7274668",
"External Link":"http://ict.usc.edu/",
"Public Code Repo":"http://code.google.com/p/svnmimir/source/checkout",
"Stats":"immersive_body-based_interactions",
"Characteristics of the Software (what software does)":"Provides innovative interaction techniques to address human-computer interaction challenges posed by Big Data. Examples include:<br>* Wiggle Interaction Technique: user induced motion to speed visual search.<br>* Immersive Tablet Based Viewers: low cost 3D virtual reality fly-through's of data sets.<br>* Multi-touch interfaces: browsing/querying multi-attribute and geospatial data, hosted by SOLR.<br>* Tablet based visualization controller: eye-free rapid interaction with visualizations.",
"Xdata Git Location":"tools\\visualizations\\usc-ict",
"License":"ALv2",
"Language (Primary)":"",
"Language (Secondary)":"",
"Category":"Visualization"
},
{
"XData Team":"Johns Hopkins University",
"Software":"igraph",
"Software (Short)":"igraph",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Johns+Hopkins+University",
"External Link":"https://github.com/igraph/xdata-igraph/",
"Public Code Repo":"https://github.com/igraph/xdata-igraph.git",
"Stats":"xdata-igraph",
"Characteristics of the Software (what software does)":"igraph provides a fast generation of large graphs, fast approximate computation of local graph invariants, fast parallelizable graph embedding. API and Web-service for batch processing graphs across formats.",
"Xdata Git Location":"tools\\analytics\\jhu",
"License":"GPLv2",
"Language (Primary)":"",
"Language (Secondary)":"",
"Category":"Analytics"
},
{
"XData Team":"Trifacta (Stanford, University of Washington, Kitware, Inc. Team)",
"Software":"Vega",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Visual+Design+Environment-Kitware+Home",
"External Link":"https://github.com/trifacta/vega",
"Stats":"vega",
"Public Code Repo":"https://github.com/trifacta/vega.git",
"Characteristics of the Software (what software does)":"Vega is a visualization grammar, a declarative format for creating and saving visualization designs. With Vega you can describe data visualizations in a JSON format, and generate interactive views using either HTML5 Canvas or SVG.",
"Xdata Git Location":"tools\\visualizations\\kitware\\vega",
"License":"BSD",
"Language (Primary)":"Javascript",
"Language (Secondary)":"",
"Category":"Visualization"
},
{
"XData Team":"Kitware, Inc.",
"Software":"Tangelo",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Visual+Design+Environment-Kitware+Home",
"External Link":"http://kitware.github.io/tangelo/",
"Public Code Repo":"https://github.com/Kitware/tangelo.git",
"Stats":"tangelo",
"Characteristics of the Software (what software does)":"Tangelo provides a flexible HTML5 web server architecture that cleanly separates your web applications (pure Javascript, HTML, and CSS) and web services (pure Python). This software is bundled with some great tools to get you started.",
"Xdata Git Location":"tools\\visualizations\\kitware\\tangelo; tools\\visualizations\\kitware\\xdata-apps",
"License":"ALv2",
"Language (Primary)":"Javascript",
"Language (Secondary)":"Python",
"Category":"Visualization"
},
{
"XData Team":"Harvard and Kitware, Inc.",
"Software":"LineUp",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Visual+Design+Environment-Kitware+Home",
"External Link":"http://sgratzl.github.io/paper-2013-lineup/",
"Public Code Repo":"https://github.com/Caleydo/org.caleydo.vis.lineup.demos.git",
"Stats":"org.caleydo.vis.lineup.demos",
"Characteristics of the Software (what software does)":"LineUp is a novel and scalable visualization technique that uses bar charts. This interactive technique supports the ranking of items based on multiple heterogeneous attributes with different scales and semantics. It enables users to interactively combine attributes and flexibly refine parameters to explore the effect of changes in the attribute combination. This process can be employed to derive actionable insights as to which attributes of an item need to be modified in order for its rank to change. Additionally, through integration of slope graphs, LineUp can also be used to compare multiple alternative rankings on the same set of items, for example, over time or across different attribute combinations. We evaluate the effectiveness of the proposed multi-attribute visualization technique in a qualitative study. The study shows that users are able to successfully solve complex ranking tasks in a short period of time.",
"Xdata Git Location":"tools\\visualizations\\kitware\\xdata-apps",
"License":"BSD",
"Category":"Visualization"
},
{
"XData Team":"Harvard and Kitware, Inc.",
"Software":"LineUp Web",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Visual+Design+Environment-Kitware+Home",
"External Link":"http://sgratzl.github.io/paper-2013-lineup/",
"Public Code Repo":"2014-06",
"Stats":"",
"Characteristics of the Software (what software does)":"LineUpWeb is the web version of the novel and scalable visualization technique. This interactive technique supports the ranking of items based on multiple heterogeneous attributes with different scales and semantics. It enables users to interactively combine attributes and flexibly refine parameters to explore the effect of changes in the attribute combination.",
"Xdata Git Location":"tools\\visualizations\\kitware\\xdata-apps",
"License":"BSD",
"Category":"Visualization"
},
{
"XData Team":"The New School",
"Software":"Visualization Widgets",
"External Link":"https://github.com/piim/xdata-visualization-widgets",
"Public Code Repo":"https://github.com/piim/xdata-visualization-widgets.git",
"Stats":"",
"Characteristics of the Software (what software does)":"These visualizations were created to demonstrate the type of standalone visualization widgets that might compliment a composite dashboard display for a descision-maker. They are built using D3 and leverage relevant APIs to show the latest available data.",
"Xdata Git Location":"tools\\visualizations\\new-school",
"License":"ALv2",
"Language (Primary)":"Javascript",
"Category":"Visualization"
},
{
"XData Team":"Stanford, University of Washington, Kitware, Inc.",
"Software":"Lyra",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Visual+Design+Environment-Kitware+Home",
"External Link":"http://idl.cs.washington.edu/projects/lyra",
"Public Code Repo":"2014-02",
"Stats":"",
"Characteristics of the Software (what software does)":"Lyra is an interactive environment that makes custom visualization design accessible to a broader audience. With Lyra, designers map data to the properties of graphical marks to author expressive visualization designs without writing code. Marks can be moved, rotated and resized using handles; relatively positioned using connectors; and parameterized by data fields using property drop zones. Lyra also provides a data pipeline interface for iterative, visual specification of data transformations and layout algorithms. Visualizations created with Lyra are represented as specifications in Vega, a declarative visualization grammar that enables sharing and reuse.",
"Xdata Git Location":"tools\\visualizations\\kitware\\xdata-apps",
"License":"BSD",
"Category":"Visualization"
},
{
"XData Team":"Phronesis",
"Software":"stat_agg",
"Internal Link":"",
"External Link":"https://github.com/kaneplusplus/stat_agg",
"Public Code Repo":"https://github.com/kaneplusplus/stat_agg.git",
"Stats":"stat_agg",
"Characteristics of the Software (what software does)":"stat_agg is a Python package that provides statistical aggregators that maximize ensemble prediction accuracy by weighting individual learners in an optimal way. When used with the laputa package, learners may be distributed across a cluster of machines. The package also provides fault-tolerance when one or more learners becomes unavailable.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Phronesis",
"Software":"flexmem",
"Internal Link":"",
"External Link":"https://github.com/kaneplusplus/flexmem",
"Public Code Repo":"https://github.com/kaneplusplus/flexmem.git",
"Stats":"flexmem",
"Characteristics of the Software (what software does)":"Flexmem is a general, transparent tool for out-of-core (OOC) computing in the R programming environment. It is launched as a command line utility, taking an application as an argument. All memory allocations larger than a specified threshold are memory-mapped to a binary file. When data are not needed, they are stored on disk. It is both process- and thread-safe.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"Phronesis",
"Software":"laputa",
"Internal Link":"",
"External Link":"https://github.com/kaneplusplus/laputa",
"Public Code Repo":"https://github.com/kaneplusplus/laputa.git",
"Stats":"laputa",
"Characteristics of the Software (what software does)":"Laputa is a Python package that provides an elastic, parallel computing foundation for the stat_agg (statistical aggregates) package.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"Phronesis",
"Software":"bigmemory",
"Internal Link":"",
"External Link":"http://bigmemory.org/",
"Public Code Repo":"http://cran.r-project.org/web/packages/bigmemory/index.html",
"Stats":"",
"Characteristics of the Software (what software does)":"Bigmemory is an R package to create, store, access, and manipulate massive matrices. Matrices are allocated to shared memory and may use memory-mapped files. Packages biganalytics, bigtabulate, synchronicity, and bigalgebra provide advanced functionality.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"Phronesis",
"Software":"bigalgebra",
"Internal Link":"",
"External Link":"http://bigmemory.org/",
"Public Code Repo":"https://r-forge.r-project.org/scm/viewvc.php/?root=bigmemory",
"Stats":"",
"Characteristics of the Software (what software does)":"Bigalgebra is an R package that provides arithmetic functions for R matrix and big.matrix objects.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"MDA Information Systems, Inc., Jet Propulsion Laboratory, USC/Information Sciences Institute",
"Software":"OODT",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=6717613",
"External Link":"http://oodt.apache.org/",
"Public Code Repo":"https://svn.apache.org/repos/asf/oodt/",
"Stats":"oodt",
"Characteristics of the Software (what software does)":"APACHE OODT enables transparent access to distributed resources, data discovery and query optimization, and distributed processing and virtual archives. OODT provides software architecture that enables models for information representation, solutions to knowledge capture problems, unification of technology, data, and metadata.",
"Xdata Git Location":"tools\\utilities\\oodt",
"License":"ALv2",
"Language (Primary)":"Java",
"Language (Secondary)":"",
"Category":"Infrastructure"
},
{
"XData Team":"MDA Information Systems, Inc.,Jet Propulsion Laboratory, USC/Information Sciences Institute",
"Software":"Wings",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=6717613",
"External Link":"http://www.wings-workflows.org/",
"Public Code Repo":"https://github.com/varunratnakar/wings.git",
"Stats":"wings",
"Characteristics of the Software (what software does)":"WINGS provides a semantic workflow system that assists scientists with the design of computational experiments. A unique feature of WINGS is that its workflow representations incorporate semantic constraints about datasets and workflow components, and are used to create and validate workflows and to generate metadata for new data products. WINGS submits workflows to execution frameworks such as Pegasus and OODT to run workflows at large scale in distributed resources.",
"Xdata Git Location":"tools\\utilities\\wings\\",
"License":"ALv2",
"Language (Primary)":"",
"Language (Secondary)":"",
"Category":"Infrastructure"
},
{
"XData Team":"MIT-LL",
"Software":"Query By Example (Graph QuBE)",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/XSW2013/MIT+Lincoln+Laboratory",
"External Link":"http://www.ll.mit.edu/mission/cybersec/HLT/HLT.html",
"Public Code Repo":"2014-02-15",
"Stats":"",
"Characteristics of the Software (what software does)":"Query-by-Example (Graph QuBE) on dynamic transaction graphs.",
"Xdata Git Location":"tools\\analytics\\mit-LL\\graph-qube",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"MIT-LL",
"Software":"Julia",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/XSW2013/MIT+Lincoln+Laboratory",
"External Link":"http://julialang.org/",
"Public Code Repo":"https://github.com/JuliaLang/julia.git",
"Stats":"julia",
"Characteristics of the Software (what software does)":"Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library.",
"Xdata Git Location":"tools\\analytics\\mit-LL",
"License":"MIT,GPL,LGPL,BSD",
"Category":"Analytics"
},
{
"XData Team":"MIT-LL",
"Software":"Topic",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/XSW2013/MIT+Lincoln+Laboratory",
"External Link":"http://www.ll.mit.edu/mission/cybersec/HLT/HLT.html",
"Public Code Repo":"Pending",
"Stats":"",
"Characteristics of the Software (what software does)":"Probabilistic Latent Semantic Analysis (pLSA) Topic Modeling.",
"Xdata Git Location":"tools\\analytics\\mit-LL",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"MIT-LL",
"Software":"SciDB",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/XSW2013/MIT+Lincoln+Laboratory",
"External Link":"http://scidb.org",
"Public Code Repo":"https://github.com/wujiang/SciDB-mirror.git",
"Stats":"SciDB-mirror",
"Characteristics of the Software (what software does)":"Scientific Database for large-scale numerical data.",
"Xdata Git Location":"tools\\analytics\\mit-LL",
"License":"GPLv3",
"Category":"Infrastructure"
},
{
"XData Team":"MIT-LL",
"Software":"Information Extractor",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/XSW2013/MIT+Lincoln+Laboratory",
"External Link":"http://www.ll.mit.edu/mission/cybersec/HLT/HLT.html",
"Public Code Repo":"Pending",
"Stats":"",
"Characteristics of the Software (what software does)":"Trainable named entity extractor (NER) and relation extractor.",
"Xdata Git Location":"tools\\analytics\\mit-LL\\tpoic",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Next Century Corporation",
"Software":"Ozone Widget Framework",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Neon+-+Next+Century",
"External Link":"http://owfgoss.org/download.html",
"Public Code Repo":"https://github.com/ozoneplatform/owf.git",
"Stats":"owf",
"Characteristics of the Software (what software does)":"Ozone Widget Framework provides a customizable open-source web application that assembles the tools you need to accomplish any task and enables those tools to communicate with each other. It is a technology-agnostic composition framework for data and visualizations in a common browser-based display and interaction environment that lowers the barrier to entry for the development of big data visualizations and enables efficient exploration of large data sets.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Visualization"
},
{
"XData Team":"Next Century Corporation",
"Software":"Neon Visualization Environment",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Neon+-+Next+Century",
"External Link":"http://neonframework.org/",
"Public Code Repo":"https://github.com/NextCenturyCorporation/neon.git",
"Stats":"neon",
"Characteristics of the Software (what software does)":"Neon is a framework that gives a datastore agnostic way for visualizations to query data and perform simple operations on that data such as filtering, aggregation, and transforms. It is divided into two parts, neon-server and neon-client. Neon-server provides a set of RESTful web services to select a datastore and perform queries and other operations on the data. Neon-client is a javascript API that provides a way to easily integrate neon-server capabilities into a visualization, and also aids in 'widgetizing' a visualization, allowing it to be integrated into a common OWF based ecosystem.",
"Xdata Git Location":"tools\\visualizations\\next-century",
"License":"ALv2",
"Category":"Visualization"
},
{
"XData Team":"Oculus Info Inc.",
"Software":"ApertureJS",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Aperture+-+Oculus+Home+Page",
"External Link":"http://aperturejs.com/",
"Public Code Repo":"https://github.com/oculusinfo/aperturejs.git",
"Stats":"aperturejs",
"Characteristics of the Software (what software does)":"ApertureJS is an open, adaptable and extensible JavaScript visualization framework with supporting REST services, designed to produce visualizations for analysts and decision makers in any common web browser. Aperture utilizes a novel layer based approach to visualization assembly, and a data mapping API that simplifies the process of adaptable transformation of data and analytic results into visual forms and properties. Aperture vizlets can be easily embedded with full interoperability in frameworks such as the Ozone Widget Framework (OWF).",
"Xdata Git Location":"tools\\visualizations\\oculus\\aperture\\aperture",
"License":"MIT",
"Language (Primary)":"JavaScript",
"Language (Secondary)":"Java",
"Category":"Visualization",
"Screenshot":"http://www.oculusinfo.com/assets/images/aperturejs-ex.png"
},
{
"XData Team":"Oculus Info Inc.",
"Software":"Influent",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Aperture+-+Oculus+Home+Page",
"External Link":"http://www.oculusinfo.com/influent",
"Public Code Repo":"https://github.com/oculusinfo/influent.git",
"Stats":"influent",
"Characteristics of the Software (what software does)":"Influent is an HTML5 tool for visually and interactively following transaction flow, rapidly revealing actors and behaviors of potential concern that might otherwise go unnoticed. Summary visualization of transactional patterns and actor characteristics, interactive link expansion and dynamic entity clustering enable Influent to operate effectively at scale with big data sources in any modern web browser. Influent has been used to explore data sets with millions of entities and hundreds of millions of transactions.",
"Xdata Git Location":"tools\\visualizations\\oculus\\aperture\\influent",
"License":"MIT",
"Language (Primary)":"JavaScript",
"Language (Secondary)":"Java",
"Category":"Visualization",
"Screenshot":"http://www.oculusinfo.com/assets/images/influent-ex.png"
},
{
"XData Team":"Oculus Info Inc.",
"Software":"Aperture Tile-Based Visual Analytics",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Aperture+-+Oculus+Home+Page",
"External Link":"http://www.oculusinfo.com/tiles",
"Public Code Repo":"https://github.com/oculusinfo/aperture-tiles.git",
"Stats":"aperture-tiles",
"Characteristics of the Software (what software does)":"New tools for raw data characterization of 'big data' are required to suggest initial hypotheses for testing. The widespread use and adoption of web-based maps has provided a familiar set of interactions for exploring abstract large data spaces. Building on these techniques, we developed tile based visual analytics that provide browser-based interactive visualization of billions of data points.",
"Xdata Git Location":"tools\\visualizations\\oculus\\aperture\\demos\\xdata\\SC2013",
"License":"MIT",
"Language (Primary)":"JavaScript",
"Language (Secondary)":"Java",
"Category":"Visualization",
"Screenshot":"http://www.oculusinfo.com/assets/images/aperture-tiles-ex.png"
},
{
"XData Team":"Oculus Info Inc.",
"Software":"Oculus Ensemble Clustering",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/VIS/Aperture+-+Oculus+Home+Page",
"External Link":"https://github.com/oculusinfo/ensemble-clustering",
"Public Code Repo":"https://github.com/oculusinfo/ensemble-clustering.git",
"Stats":"ensemble-clustering",
"Characteristics of the Software (what software does)":"Oculus Ensemble Clustering is a flexible multi-threaded clustering library for rapidly constructing tailored clustering solutions that leverage the different semantic aspects of heterogeneous data. The library can be used on a single machine using multi-threading or distributed computing using Spark.",
"Xdata Git Location":"tools\\visualizations\\oculus\\aperture\\oculus-common",
"License":"MIT",
"Language (Primary)":"Java",
"Language (Secondary)":"",
"Category":"Analytics"
},
{
"XData Team":"Raytheon BBN",
"Software":"Content and Context-based Graph Analysis: PINT, Patterns in Near-Real Time",
"Software (Short)":"Graph Analysis: PINT, Patterns in Near-Real Time",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Raytheon+BBN+Technologies",
"External Link":"https://github.com/plamenbbn/XDATA",
"Public Code Repo":"https://github.com/plamenbbn/XDATA.git",
"Stats":"XDATA",
"Characteristics of the Software (what software does)":"Patterns in Near-Real Time will take any corpus as input and quantify the strength of the query match to a SME-based process model, represent process model as a Directed Acyclic Graph (DAG), and then search and score potential matches.",
"Xdata Git Location":"tools\\analytics\\bbn\\graph-matching",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Raytheon BBN",
"Software":"Content and Context-based Graph Analysis: NILS, Network Inference of Link Strength",
"Software (Short)":"Graph Analysis: NILS, Network Inference of Link Strength",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Raytheon+BBN+Technologies",
"External Link":"https://github.com/plamenbbn/XDATA",
"Public Code Repo":"https://github.com/plamenbbn/XDATA.git",
"Stats":"XDATA",
"Characteristics of the Software (what software does)":"Network Inference of Link Strength will take any text corpus as input and quantify the strength of connections between any pair of entities. Link strength probabilities are computed via shortest path.",
"Xdata Git Location":"tools\\analytics\\bbn\\textstructure",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Royal Caliber",
"Software":"GPU based Graphlab style Gather-Apply-Scatter (GAS) platform for quickly implementing and running graph algorithms",
"Software (Short)":"GAS Platform for Quickly Running Graph Algorithms",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Royal+Caliber+LLC%2C+VertexAPI+for+GPU+Graph+Analytics",
"External Link":"https://github.com/RoyalCaliber/vertexAPI2",
"Public Code Repo":"https://github.com/RoyalCaliber/vertexAPI2.git",
"Stats":"vertexAPI2",
"Characteristics of the Software (what software does)":"Allows users to express graph algorithms as a series of Gather-Apply-Scatter (GAS) steps similar to GraphLab. Runs these vertex programs using a single or multiple GPUs - demonstrates a large speedup over GraphLab.",
"Xdata Git Location":"tools\\analytics\\RoyalCaliber",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Scientific Systems Company, Inc., MIT, and University of Louisville",
"Software":"BayesDB",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=7274588",
"External Link":"http://probcomp.csail.mit.edu/bayesdb/",
"Public Code Repo":"https://github.com/mit-probabilistic-computing-project/BayesDB.git",
"Stats":"BayesDB",
"Characteristics of the Software (what software does)":"BayesDB is an open-source implementation of a predictive database table. It provides predictive extensions to SQL that enable users to query the implications of their data --- predict missing entries, identify predictive relationships between columns, and examine synthetic populations --- based on a Bayesian machine learning system in the backend. ",
"Xdata Git Location":"tools\\analytics\\ssci-mit-ul\\tabular-predDB",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Scientific Systems Company, Inc., MIT, and University of Louisville",
"Software":"Crosscat",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=7274588",
"External Link":"http://probcomp.csail.mit.edu/crosscat/",
"Public Code Repo":"https://github.com/mit-probabilistic-computing-project/crosscat.git",
"Stats":"crosscat",
"Characteristics of the Software (what software does)":"CrossCat is a domain-general, Bayesian method for analyzing high-dimensional data tables. CrossCat estimates the full joint distribution over the variables in the table from the data via approximate inference in a hierarchical, nonparametric Bayesian model, and provides efficient samplers for every conditional distribution. CrossCat combines strengths of nonparametric mixture modeling and Bayesian network structure learning: it can model any joint distribution given enough data by positing latent variables, but also discovers independencies between the observable variables.",
"Xdata Git Location":"tools\\analytics\\ssci-mit-ul\\tabular-predDB",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Sotera Defense Solutions, Inc.",
"Software":"Zephyr",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Sotera+Defense+Systems+-+Goals+and+Analytics",
"External Link":"http://github.com/Sotera/zephyr",
"Public Code Repo":"http://github.com/Sotera/zephyr",
"Stats":"zephyr",
"Characteristics of the Software (what software does)":"Zephyr is a big data, platform agnostic ETL API, with Hadoop MapReduce, Storm, and other big data bindings.",
"Xdata Git Location":"tools\\utilities\\sotera\\zephyr; tools\\utilties\\sotera\\zephyr-xdata-ingest",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"Sotera Defense Solutions, Inc.",
"Software":"Page Rank",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Sotera+Defense+Systems+-+Goals+and+Analytics",
"External Link":"https://github.com/Sotera/page-rank",
"Public Code Repo":"https://github.com/Sotera/page-rank.git",
"Stats":"page-rank",
"Characteristics of the Software (what software does)":"Sotera Page Rank is a Giraph/Hadoop implementation of a distributed version of the Page Rank algorithm.",
"Xdata Git Location":"tools\\utilities\\sotera\\",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Sotera Defense Solutions, Inc.",
"Software":"Louvain Modularity",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Sotera+Defense+Systems+-+Goals+and+Analytics",
"External Link":"http://sotera.github.io/distributed-louvain-modularity/",
"Public Code Repo":"https://github.com/Sotera/distributed-louvain-modularity.git",
"Stats":"distributed-louvain-modularity",
"Characteristics of the Software (what software does)":"Giraph/Hadoop implementation of a distributed version of the Louvain community detection algorithm.",
"Xdata Git Location":"tools\\analytics\\sotera\\LouvainModularity\\giraph_1.0\\",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Sotera Defense Solutions, Inc.",
"Software":"Spark MicroPath",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Sotera+Defense+Systems+-+Goals+and+Analytics",
"External Link":"http://sotera.github.io/aggregate-micro-paths/",
"Public Code Repo":"https://github.com/Sotera/aggregate-micro-paths.git",
"Stats":"",
"Characteristics of the Software (what software does)":"The Spark implementation of the micropath analytic.",
"Xdata Git Location":"tools\\analytics\\sotera\\SparkMicroPath\\",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Sotera Defense Solutions, Inc.",
"Software":"ARIMA",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Sotera+Defense+Systems+-+Goals+and+Analytics",
"External Link":"https://github.com/Sotera/rhipe-arima",
"Public Code Repo":"https://github.com/Sotera/rhipe-arima",
"Stats":"rhipe-arima",
"Characteristics of the Software (what software does)":"Hive and RHIPE implementation of an ARIMA analytic.",
"Xdata Git Location":"tools\\analytics\\sotera\\arima\\",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Sotera Defense Solutions, Inc.",
"Software":" Leaf Compression",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Sotera+Defense+Systems+-+Goals+and+Analytics",
"External Link":"https://github.com/Sotera/leaf-compression",
"Public Code Repo":"https://github.com/Sotera/leaf-compression.git",
"Stats":"leaf-compression",
"Characteristics of the Software (what software does)":"Recursive algorithm to remove nodes from a network where degree centrality is 1.",
"Xdata Git Location":"tools\\analytics\\sotera\\leaf_compression",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Sotera Defense Solutions, Inc.",
"Software":"Correlation Approximation",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Sotera+Defense+Systems+-+Goals+and+Analytics",
"External Link":"https://github.com/Sotera/correlation-approximation",
"Public Code Repo":"https://github.com/Sotera/correlation-approximation",
"Stats":"correlation-approximation",
"Characteristics of the Software (what software does)":"Spark implementation of an algorithm to find highly correlated vectors using an approximation algorithm.",
"Xdata Git Location":"tools\\analytics\\sotera\\CorrelationApproximation",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Stanford University - Boyd",
"Software":"QCML (Quadratic Cone Modeling Language)",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Stanford+Convex+Optimization+Group",
"External Link":"https://github.com/cvxgrp/qcml",
"Public Code Repo":"https://github.com/cvxgrp/qcml.git",
"Stats":"qcml",
"Characteristics of the Software (what software does)":"Seamless transition from prototyping to code generation. Enable ease and expressiveness of convex optimization across scales with little change in code.",
"Xdata Git Location":"tools\\analytics\\stanford-b\\qcml",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Stanford University - Boyd",
"Software":"PDOS (Primal-dual operator splitting)",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Stanford+Convex+Optimization+Group",
"External Link":"https://github.com/cvxgrp/pdos",
"Public Code Repo":"https://github.com/cvxgrp/pdos.git",
"Stats":"pdos",
"Characteristics of the Software (what software does)":"Concise algorithm for solving convex problems; solves problems passed from QCML.",
"Xdata Git Location":"tools\\analytics\\stanford-b\\pdos",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Stanford University - Boyd",
"Software":"SCS (Self-dual Cone Solver)",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Stanford+Convex+Optimization+Group",
"External Link":"https://github.com/cvxgrp/scs",
"Public Code Repo":"https://github.com/cvxgrp/scs.git",
"Stats":"scs",
"Characteristics of the Software (what software does)":"Implementation of a solver for general cone programs, including linear, second-order, semidefinite and exponential cones, based on an operator splitting method applied to a self-dual homogeneous embedding. The method and software supports both direct factorization, with factorization caching, and an indirect method, that requires only the operator associated with the problem data and its adjoint. The implementation includes interfaces to CVX, CVXPY, matlab, as well as test routines. This code is described in detail in an associated paper, at http://www.stanford.edu/~boyd/papers/pdos.html (which also links to the code).",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Stanford University - Boyd",
"Software":"ECOS: An SOCP Solver for Embedded Systems",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Stanford+Convex+Optimization+Group",
"External Link":"https://github.com/ifa-ethz/ecos",
"Public Code Repo":"https://github.com/ifa-ethz/ecos.git",
"Stats":"ecos",
"Characteristics of the Software (what software does)":"ECOS is a lightweight primal-dual homogeneous interior-point solver for SOCPs, for use in embedded systems as well as a base solver for use in large scale distributed solvers. It is described in the paper at http://www.stanford.edu/~boyd/papers/ecos.html.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Stanford University - Boyd",
"Software":"Proximal Operators",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/Stanford+Convex+Optimization+Group",
"External Link":"https://github.com/cvxgrp/proximal",
"Public Code Repo":"https://github.com/cvxgrp/proximal.git",
"Stats":"ecos",
"Characteristics of the Software (what software does)":"This library contains sample implementations of various proximal operators in Matlab. These implementations are intended to be pedagogical, not the most performant. This code is associated with the paper Proximal Algorithms by Neal Parikh and Stephen Boyd.",
"Xdata Git Location":"",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Stanford University - Hanrahan",
"Software":"imMens",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=7274499",
"External Link":"http://vis.stanford.edu/projects/immens/",
"Public Code Repo":"https://github.com/StanfordHCI/imMens.git",
"Stats":"imMens",
"Characteristics of the Software (what software does)":"imMens is a web-based system for interactive visualization of large databases. imMens uses binned aggregation to produce summary visualizations that avoid the shortcomings of standard sampling-based approaches. Through data decomposition methods (to limit data transfer) and GPU computation via WebGL (for parallel query processing), imMens enables real-time (50fps) visual querying of billion+ element databases.",
"Xdata Git Location":"tools\\analytics\\stanford-h",
"License":"BSD",
"Category":"Visualization"
},
{
"XData Team":"Stanford University - Hanrahan",
"Software":"trelliscope",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=7274499",
"External Link":"http://hafen.github.io/trelliscope/",
"Public Code Repo":"https://github.com/hafen/trelliscope.git",
"Stats":"trelliscope",
"Characteristics of the Software (what software does)":"Trellis Display, developed in the 90s, also divides the data. A visualization method is applied to each subset and shown on one panel of a multi-panel trellis display. This framework is a very powerful mechanism for all data, large and small. Trelliscope, a layer that uses datadr, extends Trellis to large complex data. An interactive viewer is available for viewing subsets of very large displays, and the software provides the capability to sample subsets of panels from rigorous sampling plans. Sampling is often necessary because in most applications, there are too many subsets to look at them all.",
"Xdata Git Location":"tools\\analytics\\stanford-h",
"License":"BSD",
"Category":"Visualization"
},
{
"XData Team":"Stanford University - Hanrahan",
"Software":"RHIPE: R and Hadoop Integrated Programming Environment",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=7274499",
"External Link":"http://www.datadr.org/",
"Public Code Repo":"https://github.com/saptarshiguha/RHIPE.git",
"Stats":"RHIPE",
"Characteristics of the Software (what software does)":"In Divide and Recombine (D&R), big data are divided into subsets in one or more ways, forming divisions. Analytic methods, numeric-categorical methods of machine learning and statistics plus visualization methods, are applied to each of the subsets of a division. Then the subset outputs for each method are recombined. D&R methods of division and recombination seek to make the statistical accuracy of recombinations as large as possible, ideally close to that of the hypothetical direct, all-data application of the methods. The D&R computational environment starts with RHIPE, a merger of R and Hadoop. RHIPE allows an analyst to carry out D&R analysis of big data wholly from within R, and use any of the thousands of methods available in R. RHIPE communicates with Hadoop to carry out the big, parallel computations.",
"Xdata Git Location":"tools\\analytics\\stanford-h",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"Stanford University - Hanrahan",
"Software":"Riposte",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=7274499",
"External Link":"https://github.com/jtalbot/riposte",
"Public Code Repo":"https://github.com/jtalbot/riposte.git",
"Stats":"riposte",
"Characteristics of the Software (what software does)":"Riposte is a fast interpreter and JIT for R. The Riposte VM has 2 cooperative subVMs for R scripting (like Java) and for R vector computation (like APL). Our scripting code has been 2-4x faster in Riposte than in R's recent bytecode interpreter. Vector-heavy code is 5-10x faster. Speeding up R can greatly increases the analyst's efficiency.",
"License":"BSD",
"Category":"Analytics"
},
{
"XData Team":"Stanford University - Olukotun",
"Software":"Delite",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/XSW2013/Stanford+XGraph",
"External Link":"https://github.com/stanford-ppl/delite",
"Public Code Repo":"https://github.com/stanford-ppl/Delite.git",
"Stats":"Delite",
"Characteristics of the Software (what software does)":"Delite is a compiler framework and runtime for parallel embedded domain-specific languages (DSLs).",
"Xdata Git Location":"tools\\analytics\\stanford-o",
"License":"BSD",
"Category":"Infrastructure"
},
{
"XData Team":"Stanford University - Olukotun",
"Software":"SNAP",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/XSW2013/Stanford+XGraph",
"External Link":"https://snap.stanford.edu/",
"Public Code Repo":"https://github.com/snap-stanford/snap",
"Stats":"snap",
"Characteristics of the Software (what software does)":"Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library. It is written in C++ and easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges.",
"Xdata Git Location":"tools\\analytics\\stanford-o",
"License":"BSD",
"Category":"Infrastructure"
},
{
"XData Team":"SYSTAP, LLC",
"Software":"bigdata",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/SYSTAP%2C+LLC",
"External Link":"http://sourceforge.net/projects/bigdata/",
"Public Code Repo":"https://bigdata.svn.sourceforge.net/svnroot/bigdata/",
"Stats":"bigdata",
"Characteristics of the Software (what software does)":"Bigdata enables massively parallel graph processing on GPUs and many core CPUs. The approach is based on the decomposition of a graph algorithm as a vertex program. The initial implementation supports an API based on the GraphLab 2.1 Gather Apply Scatter (GAS) API. Execution is available on GPUs, Intel Xenon Phi (aka MIC), and multi-core GPUs. ",
"Xdata Git Location":"tools\\analytics\\SYSTAP",
"License":"GPLv2",
"Category":"Infrastructure"
},
{
"XData Team":"SYSTAP, LLC",
"Software":"mpgraph",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/SYSTAP%2C+LLC",
"External Link":"https://sourceforge.net/projects/mpgraph/",
"Public Code Repo":"http://svn.code.sf.net/p/mpgraph/code/",
"Stats":"mpgraph",
"Characteristics of the Software (what software does)":"Mpgraph enables massively parallel graph processing on GPUs and many core CPUs. The approach is based on the decomposition of a graph algorithm as a vertex program. The initial implementation supports an API based on the GraphLab 2.1 Gather Apply Scatter (GAS) API. Execution is available on GPUs, Intel Xenon Phi (aka MIC), and multi-core GPUs. ",
"Xdata Git Location":"tools\\analytics\\SYSTAP",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"UC Davis",
"Software":"Gunrock",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/ANL/SYSTAP%2C+LLC",
"External Link":"http://gunrock.github.io/gunrock/",
"Public Code Repo":"https://github.com/gunrock/gunrock.git",
"Stats":"gunrock",
"Characteristics of the Software (what software does)":"Gunrock is a CUDA library for graph primitives that refactors, integrates, and generalizes best-of-class GPU implementations of breadth-first search, connected components, and betweenness centrality into a unified code base useful for future development of high-performance GPU graph primitives.",
"Xdata Git Location":"tools\\analytics\\SYSTAP",
"License":"ALv2",
"Category":"Analytics"
},
{
"XData Team":"Draper Laboratory",
"Software":"Analytic Activity Logger",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/display/XSW2013/draper@summerCamp2013",
"External Link":"https://github.com/draperlab/xdatalogger",
"Public Code Repo":"https://github.com/draperlab/xdatalogger.git",
"Stats":"xdatalogger",
"Characteristics of the Software (what software does)":"Analytic Activity Logger is an API that creates a common message passing interface to allow heterogeneous software components to communicate with an activity logging engine. Recording a user's analytic activities enables estimation of operational context and workflow. Combined with psychophysiology sensing, analytic activity logging further enables estimation of the user's arousal, cognitive load, and engagement with the tool.",
"Xdata Git Location":"tools\\utilities\\draper",
"License":"ALv2",
"Language (Primary)":"Javascript",
"Language (Secondary)":"",
"Category":"Infrastructure"
},
{
"XData Team":"University of California, Berkeley",
"Software":"BDAS",
"Internal Link":"N/A",
"External Link":"https://amplab.cs.berkeley.edu/software/",
"Public Code Repo":"N/A",
"Stats":"",
"Characteristics of the Software (what software does)":"BDAS, the Berkeley Data Analytics Stack, is an open source software stack that integrates software components being built by the AMPLab to make sense of Big Data.",
"Xdata Git Location":"N/A",
"License":"ALv2, BSD",
"Category":"Infrastructure"
},
{
"XData Team":"University of California, Berkeley",
"Software":"Spark",
"Internal Link":"N/A",
"External Link":"http://spark.incubator.apache.org/",
"Public Code Repo":"https://git-wip-us.apache.org/repos/asf/incubator-spark.git",
"Stats":"incubator-spark",
"Characteristics of the Software (what software does)":"Apache Spark is an open source cluster computing system that aims to make data analytics both fast to run and fast to write. To run programs faster, Spark offers a general execution model that can optimize arbitrary operator graphs, and supports in-memory computing, which lets it query data faster than disk-based engines like Hadoop. To make programming faster, Spark provides clean, concise APIs in Python, Scala and Java. You can also use Spark interactively from the Scala and Python shells to rapidly query big datasets.",
"Xdata Git Location":"N/A",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"University of California, Berkeley",
"Software":"Shark",
"Internal Link":"N/A",
"External Link":"https://github.com/amplab/shark",
"Public Code Repo":"https://github.com/amplab/shark.git",
"Stats":"shark",
"Characteristics of the Software (what software does)":"Shark is a large-scale data warehouse system for Spark that is designed to be compatible with Apache Hive. It can execute Hive QL queries up to 100 times faster than Hive without any modification to the existing data or queries. Shark supports Hive's query language, metastore, serialization formats, and user-defined functions, providing seamless integration with existing Hive deployments and a familiar, more powerful option for new ones.",
"Xdata Git Location":"N/A",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"University of California, Berkeley",
"Software":"BlinkDB",
"Internal Link":"N/A",
"External Link":"http://blinkdb.org/",
"Public Code Repo":"https://github.com/sameeragarwal/blinkdb.git",
"Stats":"blinkdb",
"Characteristics of the Software (what software does)":"BlinkDB is a massively parallel, approximate query engine for running interactive SQL queries on large volumes of data. It allows users to trade-off query accuracy for response time, enabling interactive queries over massive data by running queries on data samples and presenting results annotated with meaningful error bars. To achieve this, BlinkDB uses two key ideas: (1) An adaptive optimization framework that builds and maintains a set of multi-dimensional samples from original data over time, and (2) A dynamic sample selection strategy that selects an appropriately sized sample based on a query's accuracy and/or response time requirements. We have evaluated BlinkDB on the well-known TPC-H benchmarks, a real-world analytic workload derived from Conviva Inc. and are in the process of deploying it at Facebook Inc. ",
"Xdata Git Location":"N/A",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"University of California, Berkeley",
"Software":"Mesos",
"Internal Link":"N/A",
"External Link":"http://mesos.apache.org/",
"Public Code Repo":"https://git-wip-us.apache.org/repos/asf/mesos.git",
"Stats":"mesos",
"Characteristics of the Software (what software does)":"Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, MPI, Hypertable, Spark, and other applications on a dynamically shared pool of nodes.",
"Xdata Git Location":"N/A",
"License":"ALv2",
"Category":"Infrastructure"
},
{
"XData Team":"University of California, Berkeley",
"Software":"Tachyon",
"Internal Link":"N/A",
"External Link":"https://github.com/amplab/tachyon",
"Public Code Repo":"https://github.com/amplab/tachyon.git",
"Stats":"tachyon",
"Characteristics of the Software (what software does)":"Tachyon is a fault tolerant distributed file system enabling reliable file sharing at memory-speed across cluster frameworks, such as Spark and MapReduce. It achieves high performance by leveraging lineage information and using memory aggressively. Tachyon caches working set files in memory, and enables different jobs/queries and frameworks to access cached files at memory speed. Thus, Tachyon avoids going to disk to load datasets that are frequently read.",
"Xdata Git Location":"N/A",
"License":"BSD",
"Category":"Infrastructure"
},
{
"XData Team":"University of Southern California",
"Software":"goffish",
"Internal Link":"https://xd-wiki.xdata.data-tactics-corp.com:8443/pages/viewpage.action?pageId=6717524",
"External Link":"https://github.com/usc-cloud/goffish",
"Public Code Repo":"https://github.com/usc-cloud/goffish.git",
"Stats":"goffish",
"Characteristics of the Software (what software does)":"The GoFFish project offers a distributed framework for storing timeseries graphs and composing graph analytics. It takes a clean-slate approach that leverages best practices and patterns from scalable data analytics such as Hadoop, HDFS, Hive, and Giraph, but with an emphasis on performing native analytics on graph (rather than tuple) data structures. This offers an more intuitive storage, access and programming model for graph datasets while also ensuring performance optimized for efficient analysis over large graphs (millions-billions of vertices) and many instances of them (thousands-millions of graph instances).",
"Xdata Git Location":"N/A",
"License":"ALv2",
"Category":"Infrastructure"
}
]